internetarchive / dweb-mirror

Offline Internet Archive project
https://www-dweb-mirror.dev.archive.org/
GNU Affero General Public License v3.0

crawl 504 (Gateway Time-out) on gateway neither retried nor flagged as error #248

Closed mitra42 closed 4 years ago

mitra42 commented 5 years ago

Steps to reproduce (STR):

mv /Volumes/x/archiveorg /Volumes/x/archiveorg-
mkdir /Volumes/x/archiveorg
internetarchive -sc 

Watch for a 504. Notice that no error is reported at the end of the crawl, and that the query isn't retried.
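
For illustration, a minimal sketch of the retry behaviour this report expects, not the crawler's actual code. `withRetry`, its options, and the `status` property on the thrown error are all hypothetical names chosen for the example; it assumes the failing request surfaces as a rejected promise whose error carries the HTTP status.

```javascript
// Hypothetical helper (not part of dweb-mirror): retry an async operation
// when it fails with a gateway status that is plausibly transient.
const TRANSIENT_STATUSES = new Set([502, 503, 504]);

async function withRetry(operation, { retries = 2, delayMs = 0 } = {}) {
  for (let attempt = 0; attempt <= retries; attempt++) {
    try {
      return await operation(attempt);
    } catch (err) {
      // Assumption: the error object exposes the HTTP status as err.status.
      const transient = TRANSIENT_STATUSES.has(err.status);
      // Give up on non-transient errors, or once the retries are used up,
      // so the failure reaches the caller instead of being swallowed.
      if (!transient || attempt === retries) throw err;
      // Simple linear back-off before the next attempt.
      await new Promise((resolve) => setTimeout(resolve, delayMs * (attempt + 1)));
    }
  }
  throw new Error("withRetry: retries must be >= 0");
}
```

With `retries: 2`, a query that returns 504 twice and then succeeds is attempted three times; one that keeps failing rejects with the last 504, so it can be counted as an error at the end of the crawl.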

mitra42 commented 5 years ago

Question

The decision in DAC.ArchiveItem._fetch_query was not to report errors, but to carry on working with whatever already exists. We need at least an option to request that errors be reported, and a way to pass them up to the crawler.
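
One way such an option could look, as a hedged sketch only: this is not the real `DAC.ArchiveItem._fetch_query`, and `fetchQuery`, `reportErrors`, `fetchPage` and `existing` are hypothetical names invented for the example. It assumes the query is an async function that rejects on failure, and that "existing" means results already held locally.

```javascript
// Hypothetical sketch, NOT the real DAC.ArchiveItem._fetch_query.
// fetchPage: async function that performs the query (assumed to reject on a 504).
// existing:  results already held locally, used as the fallback.
async function fetchQuery(fetchPage, existing, { reportErrors = false } = {}) {
  try {
    const members = await fetchPage();
    return { members, error: undefined };
  } catch (err) {
    // With the option set, the caller (for example the crawler) sees the failure.
    if (reportErrors) throw err;
    // Default: keep working on the existing results, but still hand the
    // error back so it can be recorded rather than lost.
    return { members: existing, error: err };
  }
}
```

A crawler could either call this with `reportErrors: true` and catch the rejection, or leave the default and collect the returned `error` values into its end-of-crawl summary.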

mitra42 commented 4 years ago

Haven't seen this in a while; reopen if required.