connorjoleary / DeepCite

Traversing links to find the deep source of information
GNU General Public License v3.0
69 stars 7 forks

When requesting a website, should check for status code #97

Open connorjoleary opened 3 years ago

connorjoleary commented 3 years ago

Is your feature request related to a problem? Please describe. When parsing the results of nested websites, the code currently ignores the response status code. If the response is something like a 404, we may receive an error page but treat it as if the website was returned as expected.

Describe the solution you'd like When requesting a website, check for a 200 status code, and if the response has any other status, don't parse that page.
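A minimal sketch of that check, using only the Python standard library. The function name `fetch_page_if_ok` and the `opener` parameter are hypothetical and not taken from the DeepCite code base, which may use a different HTTP client.

```python
import urllib.error
import urllib.request
from typing import Callable, Optional


def fetch_page_if_ok(
    url: str,
    opener: Callable = urllib.request.urlopen,  # injectable so the check can be tested offline
    timeout: float = 10.0,
) -> Optional[str]:
    """Return the page body only when the server answers 200, otherwise None."""
    try:
        with opener(url, timeout=timeout) as response:
            # Anything other than 200 (for example 204) is not parsed.
            if response.status != 200:
                return None
            return response.read().decode("utf-8", errors="replace")
    except urllib.error.HTTPError:
        # urlopen raises HTTPError for 4xx and 5xx responses such as 404 or 403.
        return None
    except urllib.error.URLError:
        # DNS failures, refused connections and similar network errors.
        return None
```

A caller would skip parsing whenever the function returns `None`, so an error page is never treated as the real source.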

Describe alternatives you've considered Give special results for specific codes (404, 403, ...).
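One way this alternative could look is a small lookup from status code to a result message. The names `SPECIAL_CASES` and `describe_status` and the message wording are illustrative assumptions, not part of DeepCite.

```python
from http import HTTPStatus

# Hypothetical messages for the codes called out above.
SPECIAL_CASES = {
    403: "access to the source was denied",
    404: "the source page no longer exists",
}


def describe_status(status_code: int) -> str:
    """Map an HTTP status code to a short result message."""
    if status_code == 200:
        return "ok"
    special = SPECIAL_CASES.get(status_code)
    if special is not None:
        return special
    try:
        # HTTPStatus knows the standard reason phrase for registered codes.
        phrase = HTTPStatus(status_code).phrase
    except ValueError:
        phrase = "unknown status"
    return f"unexpected response: {status_code} {phrase}"
```

Codes without a dedicated entry fall back to a generic message, so the table can start small and grow as specific cases come up.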

Additional context https://www.reddit.com/r/todayilearned/comments/nz6hl7/til_the_banana_plant_is_a_herb_distantly_related/