taganaka / polipus

Polipus: distributed and scalable web-crawler framework
MIT License

Anchor links converted to %23 causing 404 errors #57

Open ABrisset opened 9 years ago

ABrisset commented 9 years ago

When anchor links are found during the crawl (e.g. http://www.example.com/abc.html#foo), they are encoded: the "#" that introduces the anchor is replaced with the escaped sequence %23, so the request goes to http://www.example.com/abc.html%23foo and the page responds with a 404 error code. Could you fix it, please? The idea is to prevent Polipus from escaping the "#" character. Many thanks for your work on the Polipus gem.
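
To illustrate the behaviour I would expect, here is a minimal sketch using only Ruby's standard `uri` library. `strip_fragment` is a hypothetical helper, not part of Polipus, and I am assuming it is acceptable to drop the anchor before the request is made, since the fragment is never sent to the server anyway.

```ruby
require "uri"

# Hypothetical helper (not part of Polipus): remove the fragment from a
# link so that "#" is never percent-encoded into the request path.
def strip_fragment(link)
  uri = URI.parse(link)
  uri.fragment = nil # drop everything after "#"
  uri.to_s
end

link = "http://www.example.com/abc.html#foo"

# What seems to happen today: "#" is escaped as if it were part of the path.
escaped = URI.encode_www_form_component("abc.html#foo")

# What I would expect to be requested instead.
cleaned = strip_fragment(link)

puts escaped
puts cleaned
```

Links that differ only by their anchor would then resolve to the same page, which might also avoid crawling the same document several times.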