thequbit / BarkingOwl

scalable web scraper framework for finding documents on websites.
GNU General Public License v3.0

Include what the error was with the url within the bad_urls list #41

Open thequbit opened 9 years ago

thequbit commented 9 years ago

It would be great to record why a link was marked as 'bad' within the _data['bad_urls'] list. Each entry in the list would then be a dict with 'url' and 'error' keys rather than just a string.

There is at least one place, I think two, where _data['bad_urls'] is used. Those will need to iterate through the list with some different code rather than just using "url in bad_urls".