edcorcoran / internet_archive


Handle 10K search result limit better #2

Open edcorcoran opened 1 week ago

edcorcoran commented 1 week ago

I think the best solution here is to search one day at a time in a loop, which should keep each query under the 10K limit. So I can set a date range, run one search for each date in that range, and then iterate through each page of that search's results and each item on each page.
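A rough sketch of that loop, in case it helps. The `publicdate` field name and the `[X TO Y]` range syntax are assumptions on my part and should be checked against the search docs. `fetch_page` is a hypothetical callable that runs one search and returns the list of items for a given page, so the looping logic stays separate from the HTTP call.

```python
from datetime import date, timedelta
from typing import Callable, Iterator, List


def daily_queries(start: date, end: date, field: str = "publicdate") -> Iterator[str]:
    """Yield one query string per day in [start, end], inclusive.

    The field name and range syntax are assumptions, not confirmed API details.
    """
    day = start
    while day <= end:
        yield f"{field}:[{day.isoformat()} TO {day.isoformat()}]"
        day += timedelta(days=1)


def iter_items(
    start: date,
    end: date,
    fetch_page: Callable[[str, int, int], List],
    rows: int = 100,
) -> Iterator:
    """Walk every day, every page for that day, and every item on each page."""
    for query in daily_queries(start, end):
        page = 1
        while True:
            docs = fetch_page(query, page, rows)  # hypothetical search call
            if not docs:
                break
            yield from docs
            if len(docs) < rows:
                # A short page means this day's results are exhausted.
                break
            page += 1
```

This only helps if no single day returns more than 10K results; a busy day would need a narrower window.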

edcorcoran commented 1 week ago

I could also migrate to the scraping API. It's described here: https://archive.org/help/aboutsearch.htm
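For comparison, a minimal sketch of what cursor-based paging might look like. The endpoint URL, the `q`, `fields`, `count`, and `cursor` parameters, and the `items` / `cursor` response keys are my understanding of the scraping API and should be verified against the page linked above. `scrape_all` takes the fetch function as an argument so it can be exercised without network access.

```python
import json
import urllib.parse
import urllib.request
from typing import Callable, Iterator, Optional

# Assumed endpoint; confirm against the documentation before relying on it.
SCRAPE_URL = "https://archive.org/services/search/v1/scrape"


def fetch_scrape_page(query: str, cursor: Optional[str] = None, count: int = 1000) -> dict:
    """Fetch one batch of results (parameter names are assumptions)."""
    params = {"q": query, "fields": "identifier", "count": count}
    if cursor:
        params["cursor"] = cursor
    url = SCRAPE_URL + "?" + urllib.parse.urlencode(params)
    with urllib.request.urlopen(url) as resp:
        return json.load(resp)


def scrape_all(query: str, fetch: Callable[[str, Optional[str]], dict]) -> Iterator[dict]:
    """Follow the cursor until the response no longer includes one."""
    cursor = None
    while True:
        response = fetch(query, cursor)
        yield from response.get("items", [])
        cursor = response.get("cursor")
        if not cursor:
            break
```

Usage would be something like `for item in scrape_all("collection:example", fetch_scrape_page): ...`, where the query string is only a placeholder. If this works as assumed, it would replace both the per-day loop and the page loop.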