akamhy / waybackpy

Wayback Machine API interface & a command-line tool
https://pypi.org/project/waybackpy/
MIT License
453 stars, 32 forks

One question about the return time value #177

Closed: Huo-Yuan closed this issue 1 year ago

Huo-Yuan commented 1 year ago

For example, using oldest.datetime_timestamp, I get the time 2021-01-01 15:51:57.

However, I do not know which time zone it is in.

BTW, is the oldest time the time when the webpage was first published, or when it was first captured?

akamhy commented 1 year ago

Time zone = UTC.

Oldest time = the first time the archive was captured, not the day the webpage was published.
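Since the value is in UTC, it can be made explicit and converted for display. The snippet below is a minimal sketch using only the standard library. It assumes the returned value is a naive `datetime` (no `tzinfo` attached); the helper name `as_utc` and the UTC+8 display offset are illustrative and not part of waybackpy.

```python
from datetime import datetime, timedelta, timezone


def as_utc(naive_ts: datetime) -> datetime:
    """Label a naive timestamp as UTC without shifting its clock time."""
    return naive_ts.replace(tzinfo=timezone.utc)


# The value from the question, written as a naive datetime (assumption).
captured = datetime(2021, 1, 1, 15, 51, 57)

# Attach UTC, since the archive timestamp is expressed in UTC.
captured_utc = as_utc(captured)

# Convert to a fixed UTC+8 offset purely as an example of local display.
captured_local = captured_utc.astimezone(timezone(timedelta(hours=8)))

print(captured_utc.isoformat())    # 2021-01-01T15:51:57+00:00
print(captured_local.isoformat())  # 2021-01-01T23:51:57+08:00
```

If the value already carries a `tzinfo`, the `replace` step is unnecessary and only the `astimezone` call is needed.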

Huo-Yuan commented 1 year ago

Can I know the frequency of capture?

akamhy commented 1 year ago

I'm not sure what you mean by frequency of capture, so I will give multiple answers.

First answer) The frequency of capture (new archives) depends on the webpage. Google's homepage, for example, is archived hundreds of times a day, but an obscure webpage may have only a single archive, if any at all. Frequency roughly tracks the popularity of the webpage.

Second answer) Using the Save Page Now feature, you can submit at most 15 URLs per minute to be archived by the Wayback Machine.

Third answer) The Wayback Machine limits captures of the same webpage to one every 45 minutes, so you need to wait 45 minutes between two successive captures of that page. This 45-minute window may change in the future.
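The two limits above can be turned into simple client-side arithmetic. This is a rough sketch, not something waybackpy provides: the constant and function names are hypothetical, and the 15-per-minute and 45-minute figures are taken from the answers above and may change.

```python
from datetime import datetime, timedelta, timezone

# Limits as stated in this thread (assumptions; they may change).
MAX_SUBMISSIONS_PER_MINUTE = 15
RECAPTURE_WINDOW = timedelta(minutes=45)

# Even spacing that keeps submissions at or under the per-minute limit.
MIN_SPACING = timedelta(seconds=60 / MAX_SUBMISSIONS_PER_MINUTE)


def wait_before_recapture(last_capture: datetime, now: datetime) -> timedelta:
    """Return how long to wait before the same URL can be captured again."""
    remaining = last_capture + RECAPTURE_WINDOW - now
    # Never return a negative wait once the window has passed.
    return max(remaining, timedelta(0))


last = datetime(2021, 1, 1, 15, 51, 57, tzinfo=timezone.utc)
now = last + timedelta(minutes=30)

print(MIN_SPACING)                       # 0:00:04
print(wait_before_recapture(last, now))  # 0:15:00
```

Spacing submissions four seconds apart is only one way to stay under the per-minute limit; a burst of 15 followed by a pause would respect the same figure.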

Huo-Yuan commented 1 year ago

Thanks a lot, that is exactly what I was looking for.