wordtreefoundation / archdown

Command-line utility to download books from archive.org using archivist-client
MIT License

Downloading collections #2

Open ylluminate opened 5 years ago

ylluminate commented 5 years ago

I'd like to use this tool to download entire member upload collections for datasets, e.g. https://archive.org/details/someuser. Does this tool already support this? Some of your examples suggest it might, but it isn't clear how to do it.

canadaduane commented 5 years ago

Do you know what the filter would look like, using archive.org's syntax, to home in on the collection you want? Archdown uses archivist-client, which supports pretty much any filter (I think). See https://github.com/wordtreefoundation/archivist-client/blob/62fff6430d77d30602c7ec9dd1ffb1352a6aba78/lib/archivist/client/filters.rb

ylluminate commented 5 years ago

So you're saying it's not as simple as saying grab everything hanging off of this URL, but rather that the API must be used to "filter" content in order to pull in an uploader's uploads... right?

canadaduane commented 5 years ago

Yes, as far as I understand it, the URLs at archive.org don't map directly to their query API. So the first step is to figure out how they are querying/filtering to get that result, and then replicate it using something like Archdown.
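
To make the "figure out the query first" step concrete, here is a minimal sketch in Python (not Archdown or archivist-client code) that builds a URL for archive.org's advanced search endpoint and pulls item identifiers out of its JSON response. The helper names `build_search_url` and `identifiers_from_response` are made up for illustration, and the `uploader:` filter shown in the comment is only an assumption about what selects a member's uploads, so check it against the real page before relying on it.

```python
from urllib.parse import urlencode, urlparse, parse_qs

# archive.org's advanced search endpoint (returns JSON when output=json).
SEARCH_ENDPOINT = "https://archive.org/advancedsearch.php"


def build_search_url(query, rows=50, page=1):
    """Build a search URL that asks only for item identifiers.

    `query` uses archive.org's search syntax, e.g. 'collection:"somecollection"'.
    Assumption: a member's uploads might be selected with something like
    'uploader:"someone@example.com"'; verify this before using it.
    """
    params = [
        ("q", query),
        ("fl[]", "identifier"),  # only return the identifier field
        ("rows", rows),          # results per page
        ("page", page),          # 1-based page number
        ("output", "json"),
    ]
    return SEARCH_ENDPOINT + "?" + urlencode(params)


def identifiers_from_response(payload):
    """Extract identifiers from a decoded JSON search response."""
    docs = payload.get("response", {}).get("docs", [])
    return [doc["identifier"] for doc in docs]


if __name__ == "__main__":
    # Hypothetical collection name, for illustration only.
    print(build_search_url('collection:"somecollection"', rows=10))
```

Once a query returns the items you expect in the browser or with a plain HTTP client, the same filter terms are what you would then try to express through archivist-client's filters.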
