bbcarchdev / anansi

A Linked Open Data Web crawler
https://bbcarchdev.github.io/anansi/
Apache License 2.0

Handle 307 (Temporary Redirect) and 308 (Permanent Redirect) properly #70

Open nevali opened 7 years ago

nevali commented 7 years ago

Internal tracking: RESDATA-1179

nickshanks commented 6 years ago

I came to this issue after spotting the check `if(status > 300 && status < 304)` at https://github.com/bbcarchdev/anansi/blob/develop/libspider/processors/rdf.c#L162 and the 2xx check a few lines later.

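I wanted to find out why the other 3xx status codes are not included, and why attempts are made to process responses with codes such as 202 and 206. Also, 203 should probably be rejected for integrity reasons, since a 203 (Non-Authoritative Information) payload has been modified by a transforming proxy.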

In my opinion, processing would be more predictable if the first check were `if(status >= 300 && status < 400)` and the second were `if(status != 200)`.
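
To make the suggestion concrete, here is a minimal sketch of the two checks as a standalone function. The names `status_action` and `classify_status` are hypothetical and are not taken from anansi's code; the real logic in `rdf.c` is inline and may need to handle more cases. Note that a plain `>= 300 && < 400` range also catches 300 and 304, which may or may not be wanted.

```c
/* Hypothetical sketch of the proposed checks; not anansi's actual API. */
enum status_action
{
	ACTION_PROCESS,  /* 200: parse the response body */
	ACTION_REDIRECT, /* any 3xx, including 307 and 308 */
	ACTION_REJECT    /* everything else, including 202, 203 and 206 */
};

enum status_action
classify_status(int status)
{
	/* First check: treat the whole 3xx range as a redirect */
	if(status >= 300 && status < 400)
	{
		return ACTION_REDIRECT;
	}
	/* Second check: only a plain 200 is processed */
	if(status != 200)
	{
		return ACTION_REJECT;
	}
	return ACTION_PROCESS;
}
```

With this shape, 307 and 308 are followed like 301 to 303, while partial or non-authoritative responses are never handed to the RDF parser.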