> 'Content-Type': 'text/html'

Most likely it was a 50* error. I will try to include the error message. Header: {'Server': 'nginx/1.15.8', 'Date': 'Thu, 14 Jan 2021 22:13:08 GMT', 'Content-Type': 'text/html; charset=utf-8', 'Content-Length': '232', 'Connection': 'keep-alive', 'X-App-Server': 'wwwb-app52', 'X-ts': '404', 'X-Tr': '138505', 'X-RL': '0', 'X-NA': '0', 'X-Page-Cache': 'MISS', 'X-NID': 'Google'}
I don't know what all the keys in the header of the failed request represent; I will try contacting the Internet Archive for help.
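In the meantime, the headers can at least be probed programmatically. A minimal sketch, using the header dict copied from the failed request above; note that interpreting `X-ts` as the upstream HTTP status code is an assumption, not documented behavior:

```python
# Headers returned by the failed save request (copied from the report above).
headers = {
    'Server': 'nginx/1.15.8',
    'Date': 'Thu, 14 Jan 2021 22:13:08 GMT',
    'Content-Type': 'text/html; charset=utf-8',
    'Content-Length': '232',
    'X-ts': '404',
    'X-Page-Cache': 'MISS',
}

# Assumption: 'X-ts' appears to carry the HTTP status of the save attempt.
upstream_status = int(headers.get('X-ts', '0'))
save_failed = upstream_status >= 400
print(save_failed)  # True for this response
```

If that reading is right, surfacing `X-ts` in the raised exception would make the failure much easier to diagnose.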
> rechecking later finds it
I'm going to try fetching the newest archive before raising the error; if the difference in timestamps is less than 30 minutes, I will return the newest archive. According to IA, the Wayback Machine doesn't allow more than one archive per 30 minutes.
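The freshness check described above can be sketched as follows; `is_fresh` and the 30-minute window are illustrative names, not waybackpy API, though the 14-digit timestamp format is the one the Wayback Machine actually uses:

```python
from datetime import datetime, timedelta

# Wayback Machine snapshot timestamps use this 14-digit format.
WAYBACK_FMT = "%Y%m%d%H%M%S"

def is_fresh(archive_timestamp: str, now: datetime, window_minutes: int = 30) -> bool:
    """Return True if the newest archive falls within the save window,
    i.e. a fresh save would have been refused anyway and the existing
    archive can be returned instead of raising an error."""
    archived_at = datetime.strptime(archive_timestamp, WAYBACK_FMT)
    return now - archived_at <= timedelta(minutes=window_minutes)

# Example: an archive from 22:13:08 checked at 22:30:00 the same day.
print(is_fresh("20210114221308", datetime(2021, 1, 14, 22, 30)))  # True
```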
Sounds good. Perhaps a flag should be set on the archive object indicating that the archive is older than the save request?
Also note that the URLs I am archiving are very new so there is no previous archive.
The flag is cached_save; if it is True, then the archive was cached by the Wayback Machine.
Usage:
>>> import waybackpy
>>> url = "https://en.wikipedia.org/wiki/Multivariable_calculus"
>>> user_agent = "Mozilla/5.0 (Windows NT 5.1; rv:40.0) Gecko/20100101 Firefox/40.0"
>>> wayback = waybackpy.Url(url, user_agent)
>>> archive = wayback.save()
>>> archive.cached_save
True
True indicates that a cached archive was returned rather than a freshly created one.
I seem to get this error often even when saving the archive actually succeeded (rechecking later finds it). The error message could be more specific about the failure, and I'm not sure the upgrade suggestion is helpful when the installed version is already current.