internetarchive / dweb-mirror

Offline Internet Archive project
https://www-dweb-mirror.dev.archive.org/
GNU Affero General Public License v3.0

Crawl: server breaks on non-integer mtime #212

Open mitra42 opened 5 years ago

mitra42 commented 5 years ago

e.g. https://dweb.me/arc/archive.org/metadata/howwethink01dewegoog

500: invalid literal for int() with base 10: '1180974425.0'

mitra42 commented 5 years ago

Note: this problem is in the Python code. Also seen on astatementlette00cologoog and markspotteryand00philgoog.

mitra42 commented 5 years ago

127.0.0.1 - - [16/Jul/2019 23:20:20] "GET /info HTTP/1.0" 200 -
DEBUG:urllib3.connectionpool:https://archive.org:443 "GET /metadata/howwethink01dewegoog HTTP/1.1" 200 None
ERROR:root:Sending Unexpected Error 500:
Traceback (most recent call last):
  File "/usr/local/dweb-gateway/python/ServerBase.py", line 132, in _dispatch
    res = func(*args, kwargs)
  File "/usr/local/dweb-gateway/python/ServerBase.py", line 293, in wrapped
    result = func(*args, *kwargs)
  File "/usr/local/dweb-gateway/python/ServerGateway.py", line 225, in arc
    obj = ArchiveItem.new("archiveid", args, kwargs)
  File "/usr/local/dweb-gateway/python/Archive.py", line 207, in new
    verbose=verbose) for f in obj._metadata.get("files",[])]
  File "/usr/local/dweb-gateway/python/Archive.py", line 207, in <listcomp>
    verbose=verbose) for f in obj._metadata.get("files",[])]
  File "/usr/local/dweb-gateway/python/Archive.py", line 429, in new
    if obj.inTorrent() and obj.parent._metadata["metadata"].get("magnetlink"):
  File "/usr/local/dweb-gateway/python/Archive.py", line 500, in inTorrent
    if (not self._metadata.get("mtime")) or (self.parent.torrenttime() < int(self._metadata["mtime"])):
ValueError: invalid literal for int() with base 10: '1180974425.0'
127.0.0.1 - - [16/Jul/2019 23:20:20] code 500, message invalid literal for int() with base 10: '1180974425.0
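
For reference, the traceback above comes down to Python's int() rejecting a string that contains a decimal point. Below is a minimal sketch of a tolerant parse, not the actual gateway fix. The helper name parse_mtime is hypothetical, and it assumes mtime is a string of epoch seconds that may carry a fractional part.

```python
def parse_mtime(value):
    """Return mtime as whole epoch seconds, or None if missing or unparseable.

    Hypothetical helper for illustration only; it is not part of dweb-gateway.
    """
    if value is None or value == "":
        return None
    try:
        # Fast path: plain integer strings such as "1180974425".
        return int(value)
    except (TypeError, ValueError):
        pass
    try:
        # Fallback: strings such as "1180974425.0", where int() alone raises ValueError.
        return int(float(value))
    except (TypeError, ValueError, OverflowError):
        # Not a number at all, or "nan" / "inf".
        return None


if __name__ == "__main__":
    print(parse_mtime("1180974425.0"))
    print(parse_mtime("1180974425"))
    print(parse_mtime(None))
```

A check like the one in inTorrent could then compare against parse_mtime(...) and treat None the same way it currently treats a missing mtime. Whether that is the right behaviour for the crawl is an open question.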

mitra42 commented 5 years ago

Fix this when we remove the dependency on dweb.me and fetch metadata independently (#242).