hartleybrody / public-amazon-crawler

676 stars 227 forks source link

Requests fail, crawler not working. #6

Closed aaayushsingh closed 7 years ago

aaayushsingh commented 7 years ago
$ python crawler.py start
2017-10-16 13:01:04.988490: Seeding the URL frontier with subcategory URLs
2017-10-16 13:02:05.917383: WARNING: Request for https://www.amazon.in/s/ref=s9_acss_bw_cg_DressLBD_1a1_w?rh=i%3Aapparel%2Cn%3A1571271031%2Cn%3A%211571272031%2Cn%3A1953602031%2Cn%3A11400137031%2Cn%3A1968445031%2Cp_36%3A-79900%2Cp_98%3A10440597031 failed, trying again.
hartleybrody commented 7 years ago

There are a million reasons a request might not be working, likely something to do with your network connection between your machine and the proxy server, or the proxy server and amazon.

Either way, the code is handling the network error properly, by logging it and trying again. If no requests are coming back successfully, there's some issue with your particular setup.