Klocohdonou closed 2 years ago
Yes, that is clearly a problem which should be addressed. Thanks for reporting its source (www.org): it explains this weird behaviour. I've reported it in the past but couldn't find the cause nor reproduce it (cf. #321).
This should also take care of #341.
Thanks for the fix! Glad this report helped.
Hello!
In the comments section of an online article I crawled, Hyphe found a link to http://WWW.org. Here's the comment (the permalink doesn't seem to work, but if you scroll down, it's the comment by Crackly Philippe from January 12, 2021). The link is on the author's name.
Since there is no hostname in this URL, Hyphe created an entity matching the entire .org TLD (LRU prefix s:http|h:org|), and many of the websites with a .org TLD that Hyphe found while crawling were gathered into this entity.

Would it be possible to prevent this behaviour from happening? Feel free to ask if you need additional information!
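To illustrate what I think is happening, here is a minimal Python sketch. It is not Hyphe's actual code: `url_to_lru_sketch`, `is_tld_only_prefix`, `KNOWN_TLDS` and the `drop_www` flag are hypothetical names, the LRU format is simplified to scheme and host stems only, and the idea that a leading `www` label gets dropped is my assumption about how the prefix ends up as s:http|h:org|.

```python
from urllib.parse import urlsplit

# Hypothetical, deliberately tiny set for this sketch; a real check
# would more likely rely on the Public Suffix List.
KNOWN_TLDS = {"org", "com", "net", "fr"}


def url_to_lru_sketch(url, drop_www=True):
    """Build a simplified LRU-like string: scheme stem, then host labels reversed.

    Assumption: a leading "www" label is dropped, as some normalisations do.
    Paths, ports and queries are ignored to keep the sketch short.
    """
    parts = urlsplit(url)
    # urlsplit lowercases the hostname, so "WWW.org" becomes "www.org".
    labels = [label for label in (parts.hostname or "").split(".") if label]
    if drop_www and labels and labels[0] == "www":
        labels = labels[1:]
    stems = ["s:" + parts.scheme] + ["h:" + label for label in reversed(labels)]
    return "|".join(stems) + "|"


def is_tld_only_prefix(lru):
    """Return True when the only host stem of the prefix is a known TLD."""
    hosts = [stem[2:] for stem in lru.split("|") if stem.startswith("h:")]
    return len(hosts) == 1 and hosts[0] in KNOWN_TLDS


lru = url_to_lru_sketch("http://WWW.org")
if is_tld_only_prefix(lru):
    print("refusing to create an entity for prefix", lru)
```

A guard of this kind, run before an entity is created, would reject the prefix built from http://WWW.org while leaving ordinary sites such as http://www.example.org untouched.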
Thanks in advance, Kevin