Closed by ewwink 6 months ago
similarity lets you compute the structural HTML similarity of two webpages, as described in the research listed here: https://github.com/matiskay/html-similarity?tab=readme-ov-file#references
I think it's a much more solid approach than comparing titles
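To make the idea of structural similarity concrete, here is a rough standard-library sketch. It is not the html-similarity implementation; it only assumes that "structure" means the ordered sequence of opening tags, and the names `TagCollector`, `tag_sequence` and `structural_similarity_sketch` are made up for illustration.

```python
from difflib import SequenceMatcher
from html.parser import HTMLParser


class TagCollector(HTMLParser):
    """Collects the names of opening tags in document order (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self.tags = []

    def handle_starttag(self, tag, attrs):
        # Text content and attributes are ignored; only the tag name is kept.
        self.tags.append(tag)


def tag_sequence(html):
    """Return the list of opening tag names found in the HTML string."""
    collector = TagCollector()
    collector.feed(html)
    collector.close()
    return collector.tags


def structural_similarity_sketch(html_a, html_b):
    """Return a ratio in [0, 1] comparing the two pages' tag sequences."""
    return SequenceMatcher(None, tag_sequence(html_a), tag_sequence(html_b)).ratio()
```

Two pages with the same markup but different text score 1.0 under this sketch, which is why it is less sensitive to content changes than a title comparison. Whether that is good enough for telling the original apart from the CF version would need to be checked against real pages.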
Is there a case where the original and the CF version have different titles, or where the original and another website have the same title? Sorry for my curiosity 😁
I would expect both to have the same title
I think we do not need it. Why not just capture
<title>my site</title>
from the HTML using a regex? But maybe I'm wrong.
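For comparison, the regex approach suggested above could look roughly like this. It is only a sketch: `extract_title` and `same_title` are hypothetical names, and it assumes the first `<title>` element in the response is the page title.

```python
import re

# Matches the first <title>...</title> pair, case-insensitively and across newlines.
TITLE_RE = re.compile(r"<title\b[^>]*>(.*?)</title>", re.IGNORECASE | re.DOTALL)


def extract_title(html):
    """Return the page title with whitespace collapsed, or None if there is none."""
    match = TITLE_RE.search(html)
    if match is None:
        return None
    return " ".join(match.group(1).split())


def same_title(html_a, html_b):
    """True only if both pages have a title and the titles are equal."""
    title_a = extract_title(html_a)
    title_b = extract_title(html_b)
    return title_a is not None and title_a == title_b
```

This is cheap, but it only answers the question when the titles are distinctive. Two unrelated sites can share a title, and a `<title>` inside a comment or an inline SVG would also be matched.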