workeffortwaste / horseman

The detailed update and issue repository for the Horseman crawler.
https://gethorseman.app/

Crawler removes www from URLs #91

Closed by chrishaensel 1 year ago

chrishaensel commented 1 year ago

I have a set of URLs starting with https://www..... Horseman removes the "www" from each URL, the server then redirects the request back to the "www" version, and Horseman reports a 301 redirect that would not have occurred if the "www" had been left in place.
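
For illustration only: the thread does not say what caused this, and the following is not Horseman's code. One way a symptom like this can arise is when a normalized URL, intended only for comparing or de-duplicating entries, is also used as the request target. A minimal Python sketch, assuming hypothetical helpers `dedup_key` and `plan_list_crawl` and the placeholder host `example.com`, keeps the two roles separate:

```python
from urllib.parse import urlsplit, urlunsplit


def dedup_key(url: str) -> str:
    """Hypothetical helper: build a comparison key for a URL.

    The key drops a leading "www." and the fragment. It is meant only for
    detecting duplicates and must never be used as the URL to request.
    Any userinfo in the URL is dropped from the key as well.
    """
    parts = urlsplit(url)
    host = (parts.hostname or "").lower()
    if host.startswith("www."):
        host = host[len("www."):]
    # Preserve an explicit port so different ports stay distinct.
    netloc = host if parts.port is None else f"{host}:{parts.port}"
    return urlunsplit((parts.scheme.lower(), netloc, parts.path or "/", parts.query, ""))


def plan_list_crawl(urls):
    """Hypothetical helper: choose which listed URLs to request.

    Duplicates are detected via dedup_key, but each URL that is kept is
    returned exactly as it was listed, so the "www" is left in place.
    """
    seen = set()
    to_fetch = []
    for url in urls:
        key = dedup_key(url)
        if key in seen:
            continue
        seen.add(key)
        to_fetch.append(url)  # request the URL as given, not the key
    return to_fetch
```

With this split, a listed URL such as `https://www.example.com/page` is requested with its "www" intact, so a server that redirects the bare host to "www" has no reason to answer with a 301.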

workeffortwaste commented 1 year ago

That seems odd. Want to slack me some examples so I can check it out?

chrishaensel commented 1 year ago

Sent some URLs via Slack. I forgot to mention that this happened during a "list crawl".

workeffortwaste commented 1 year ago

Found the cause. 👍

[Screenshot: Screenshot_20220904-213608.png]

workeffortwaste commented 1 year ago

Fixed.