internetarchive / dweb-mirror

Offline Internet Archive project
https://www-dweb-mirror.dev.archive.org/
GNU Affero General Public License v3.0

Crawling fails - leaving tasks #320

Open mitra42 opened 4 years ago

mitra42 commented 4 years ago

Crawling fails on two of the edge cases in https://github.com/internetarchive/dweb-archive/issues/120. In both cases, their presence in the crawl causes leftover tasks.

mitra42 commented 4 years ago

I notice errors in the browser as well, for https://www-dweb-mirror.dev.archive.org/services/img/bdc-W3PD1123 and https://www-dweb-mirror.dev.archive.org/services/img/JournetsinPersiaAndKurdistanVolII

mitra42 commented 4 years ago

Also note: both appear to be typos, i.e. non-existent items, but they still shouldn't fail the crawl.