mikf / gallery-dl

Command-line program to download image galleries and collections from several image hosting sites
GNU General Public License v2.0
11.88k stars 976 forks

[AO3/ArchiveOfOurOwn] [Feature Request] Add the ability to download from the site in all its formats pdf,epub,html etc #6013

Closed: AtomicTEM closed this issue 1 month ago

AtomicTEM commented 3 months ago

There isn't any good option to download all fics from a single writer or genre/tag in one click.

Gallery-dl has been extremely effective with other sites, so I would love to also have AO3 supported please.

ClosedPort22 commented 1 month ago

Looks like we've got some duplication of efforts here... https://github.com/ClosedPort22/gallery-dl-googledrive/blob/main/extractor/archiveofourown.py

@mikf If you find anything useful, feel free to copy the code and adapt it to gallery-dl's coding conventions. The whole repo is licensed under the MIT license.

ClosedPort22 commented 1 month ago

I would recommend extracting the updated_at parameter from download URLs and adding that to the default archive format. From my experience, updated_at is guaranteed to change when the file content changes, even if it's just from the author changing their pen name.

Or you could use a combination of Updated (in-progress works only) and Completed (completed works only) to only detect changes to the story text.
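As a rough illustration of the updated_at suggestion above, here is a minimal Python sketch that pulls the parameter out of a download URL and folds it into an archive key. The URL, the work ID, the helper names (extract_updated_at, archive_key), and the key layout are all hypothetical and are not taken from gallery-dl's actual archive format.

```python
from urllib.parse import urlsplit, parse_qs


def extract_updated_at(download_url):
    """Return the updated_at query parameter as an int, or None if absent."""
    query = parse_qs(urlsplit(download_url).query)
    values = query.get("updated_at")
    if not values:
        return None
    try:
        return int(values[0])
    except ValueError:
        # Parameter present but not numeric; treat it as missing.
        return None


def archive_key(work_id, fmt, download_url):
    """Build a hypothetical archive entry that changes whenever updated_at does."""
    updated_at = extract_updated_at(download_url)
    return f"ao3_{work_id}_{updated_at}_{fmt}"


# Hypothetical download URL; the work ID, title, and timestamp are made up.
url = "https://archiveofourown.org/downloads/12345/Example.epub?updated_at=1700000000"
print(archive_key(12345, "epub", url))
```

With a key like this, a re-run would treat a work as new whenever its updated_at value changes, which matches the behaviour described in the comment.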

WarmWelcome commented 1 month ago

Would this implementation support epub? This would be a fantastic feature since I don't believe there is any robust solution for backing up AO3. Other fanfiction sites, such as wattpad, fanfiction, fimfiction, quotev, or any of the others have no robust backup solution either, and an implementation in gdl would be a fantastic addition.

AtomicTEM commented 1 month ago

> Would this implementation support epub? This would be a fantastic feature since I don't believe there is any robust solution for backing up AO3. Other fanfiction sites, such as wattpad, fanfiction, fimfiction, quotev, or any of the others have no robust backup solution either, and an implementation in gdl would be a fantastic addition.

It does. Simply put the following in your config; the default is pdf:

"formats": "epub",
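For context, a small Python sketch of where that key might sit in a full config file. It assumes the option is nested under an extractor section named ao3; that nesting and the helper name ao3_format_config are assumptions here, so check the gallery-dl configuration docs before relying on it.

```python
import json


def ao3_format_config(formats):
    """Return a config fragment selecting the download format.

    The "extractor" -> "ao3" nesting is an assumption, not confirmed
    by this thread; only the "formats" key comes from the comment above.
    """
    return {"extractor": {"ao3": {"formats": formats}}}


# Print the fragment as JSON so it can be merged into an existing config file.
print(json.dumps(ao3_format_config("epub"), indent=4))
```

The printed JSON can be merged by hand into an existing config file rather than overwriting it.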

ClosedPort22 commented 1 month ago

> Other fanfiction sites, such as wattpad, fanfiction, fimfiction, quotev, or any of the others have no robust backup solution either, and an implementation in gdl would be a fantastic addition.

For fanfiction.net, you can check out https://github.com/JimmXinu/FanFicFare. FFN is notoriously hostile toward scrapers and downloaders and uses Cloudflare anti-bot protection to prevent story text from being scraped. As a result, the developer has been recommending that people use the "browser cache" feature instead. That feature relies on user interaction, so mass archival isn't really feasible, I'm afraid.

WarmWelcome commented 1 month ago

> check out https://github.com/JimmXinu/FanFicFare

How in the world haven't I seen this before? Appears in no searches, never seen any recommendations for it... Well, at least I have it now. Thank you kindly for it :)