ekalinin / robots.js

Parser for robots.txt for node.js
MIT License
66 stars 21 forks

Guard against infinite redirect loop #24

bsander closed 7 years ago

bsander commented 7 years ago

Recently we encountered a situation where, due to a misconfigured server, a request to a robots.txt file resulted in an infinite redirect loop between the http and https protocols. This PR adds a guard for these kinds of situations. I followed Google's specification in deciding how to deal with this:

"Redirects will generally be followed until a valid result can be found (or a loop is recognized). We will follow a limited number of redirect hops (RFC 1945 for HTTP/1.0 allows up to 5 hops) and then stop and treat it as a 404."
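
For readers who want to see the idea in code, here is a minimal sketch of such a guard. It is not the actual patch from this PR. The names `resolveRobotsTxt`, `fetchOnce`, and `MAX_REDIRECT_HOPS` are hypothetical, and `fetchOnce(url)` is assumed to return a plain object of the form `{ statusCode, location, body }`. The sketch is synchronous to keep the control flow easy to follow; a real fetcher in node.js would be asynchronous.

```javascript
// Hypothetical sketch of a redirect guard; not the code merged in this PR.
const MAX_REDIRECT_HOPS = 5; // hop limit taken from the quoted specification

// fetchOnce is an assumed, injected helper: (url) => { statusCode, location, body }.
function resolveRobotsTxt(startUrl, fetchOnce, maxHops = MAX_REDIRECT_HOPS) {
  const seen = new Set(); // URLs already requested, used to recognize loops
  let url = startUrl;

  // One initial request plus at most maxHops followed redirects.
  for (let hops = 0; hops <= maxHops; hops++) {
    if (seen.has(url)) {
      // Loop recognized (e.g. http -> https -> http): treat as a 404.
      return { statusCode: 404, url: url, body: '' };
    }
    seen.add(url);

    const res = fetchOnce(url);
    const isRedirect =
      res.statusCode >= 300 && res.statusCode < 400 && Boolean(res.location);
    if (!isRedirect) {
      // Final answer: pass the status and body through unchanged.
      return { statusCode: res.statusCode, url: url, body: res.body || '' };
    }

    // Resolve a possibly relative Location header against the current URL.
    url = new URL(res.location, url).href;
  }

  // Hop limit exceeded without reaching a final response: treat as a 404.
  return { statusCode: 404, url: url, body: '' };
}

// Usage with a stub that bounces between http and https forever.
const bouncing = (url) => ({
  statusCode: 301,
  location: url.startsWith('https://')
    ? 'http://example.com/robots.txt'
    : 'https://example.com/robots.txt',
});
console.log(resolveRobotsTxt('http://example.com/robots.txt', bouncing).statusCode);
```

Tracking visited URLs catches short loops immediately, while the hop counter bounds chains of distinct URLs that never settle. Either case is reported as a 404, which matches the behaviour the quoted specification describes.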

ekalinin commented 7 years ago

Great! Thanks!

bsander commented 7 years ago

Happy to contribute 😄. Would it be possible for you to publish a new release which contains this patch anytime soon? Thanks!

ekalinin commented 7 years ago

Sure. Already done:

bsander commented 7 years ago

Awesome, thanks!