qri-io / walk

Webcrawler/sitemapper
GNU General Public License v3.0

Refactor Redirect Handling & Add Tests #9

Open b5 opened 6 years ago

b5 commented 6 years ago

This came out of #2.

Our redirect logic is currently untested. We've already identified a potential bug:

Will calling w.Completed() in the redirect handler cause the coordinator to stop early? (If there are no more items in the queue, this marks the URL the coordinator queued as done, even though there are still requests to be made in the redirect chain.)

We intended to change how redirects are represented: a slice of strings that traces the entire redirect path, with a resource created for each stop along the way. We should implement this & add tests, being sure to cover the case where the redirect count exceeds the maximum allowed, and make sure that case fails gracefully.