int-ua closed 5 years ago
I forgot to leave a note on why I hadn't used requests.head earlier. I remembered when I suddenly started finding bunches of "bad" links that were actually "good", due to anti-crawling server behavior.
This actually generates a lot of false positives.
I think this can be made a parameter, for those who know they won't hit the anti-crawling issue that shows up with requests.head but not with requests.get.
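For illustration, here is a minimal sketch of what such a parameter could look like. It is not the code from the actual commit: the function name link_is_alive, the use_head flag, and the fallback to GET after a failed HEAD are all assumptions made for this example. The session argument is assumed to be a requests.Session, or anything else with compatible head and get methods.

```python
def link_is_alive(url, session, use_head=False, timeout=10):
    """Return True if `url` answers with a status code below 400.

    Hypothetical helper: `session` is any object exposing `head` and `get`
    methods with the calling convention of `requests.Session`.
    """
    if use_head:
        # HEAD is cheaper, but some servers reject it as crawler traffic.
        response = session.head(url, allow_redirects=True, timeout=timeout)
        if response.status_code < 400:
            return True
        # Fall through to GET so an anti-crawling reply to HEAD
        # is not reported as a dead link.
    # stream=True defers downloading the body; only the status is needed.
    response = session.get(url, allow_redirects=True, timeout=timeout, stream=True)
    return response.status_code < 400
```

With requests installed, this could be called as link_is_alive(url, requests.Session(), use_head=True). Leaving use_head at its default of False keeps the GET-only behavior, so nobody gets the false positives unless they opt in.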
This was done in 404e941caab1d266f.
Fixes #4 Fixes issue with "https://" being detected as URL