FerrahWolfeh / imageboard-downloader-rs

Cli utility to bulk download content from popular imageboard sites
MIT License

Download e621 pools with correct page order #4

Closed videah closed 1 year ago

videah commented 1 year ago

Hello! I'm not sure if this is out of scope for this project since it's designed to work with many imageboards, but I'm looking for a tool that can download comics (pools) from e621 (and zip it up as a .cbz) with the correct page order.

Currently, using pool:<id> as the tag does download a pool, but all of the pages come out of order, and the tool is clearly not meant for this use case.

Thank you for making this tool regardless, I'll find it very useful!

FerrahWolfeh commented 1 year ago

Heya, I'm glad you're liking this utility.

I took a look at e6's API documentation this morning and yeah, it's possible to add this functionality (at least for e621; I've got to take a look at danbooru to see if it has a similar API).

Meanwhile, you can use the --id and --cbz flags together for the utility to try ordering the pages as best as possible, assuming each page was posted one after the other.

FerrahWolfeh commented 1 year ago

Upon closer inspection of both e621 and danbooru APIs regarding pools, it seems that there is no efficient way of fetching all posts of a pool without essentially abusing the API and/or being rate-limited. I'm afraid that there's nothing I can do for now about correctly ordered pool downloading.

I also saw that I can use a special search trick in e621 to get all posts from a pool easily. However, doing that is pretty unreliable with very big pools, or in danbooru, due to limits on how many tags can be searched.

My best guess is currently using the method I described above:

Meanwhile, you can use the --id and --cbz flags together for the utility to try ordering the pages as best as possible, assuming each page was posted one after the other.
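As a rough illustration of the search-limit problem described above, here is a minimal sketch of how an ordered list of pool post IDs could be split into small batches and the results put back into pool order afterwards. It assumes the site can hand back the pool's post IDs in page order, which is an assumption here and not something confirmed in this thread. The names `batch_ids`, `restore_order` and `max_per_query` are hypothetical and are not part of imageboard_downloader.

```rust
use std::collections::HashMap;

/// Splits an ordered list of post IDs into batches of at most `max_per_query`,
/// so that each (hypothetical) search stays under the site's tag limit.
fn batch_ids(pool_order: &[u64], max_per_query: usize) -> Vec<Vec<u64>> {
    pool_order
        .chunks(max_per_query.max(1)) // guard against a zero batch size
        .map(|chunk| chunk.to_vec())
        .collect()
}

/// Puts posts that came back in arbitrary order into pool order.
/// IDs that are not part of the pool are dropped.
fn restore_order(pool_order: &[u64], mut fetched: Vec<u64>) -> Vec<u64> {
    // Map each post ID to its page position within the pool.
    let position: HashMap<u64, usize> = pool_order
        .iter()
        .enumerate()
        .map(|(index, &id)| (id, index))
        .collect();
    fetched.retain(|id| position.contains_key(id));
    fetched.sort_by_key(|id| position[id]);
    fetched
}

fn main() {
    // Assumed input: page order as reported by the pool (made-up IDs).
    let pool_order = [500, 120, 340, 90, 610];

    for batch in batch_ids(&pool_order, 2) {
        println!("one query for posts {:?}", batch);
    }

    // Search results typically come back sorted by ID, not by page.
    let fetched = vec![90, 120, 340, 500, 610];
    println!("pool order: {:?}", restore_order(&pool_order, fetched));
}
```

The batching keeps each query small, at the cost of more requests for big pools, which is the rate-limit trade-off mentioned above.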

FerrahWolfeh commented 1 year ago

However, I may come up with another crazy idea to download it efficiently, so don't take it as a wontfix. It will just need some time.

videah commented 1 year ago

I appreciate the quick response. I've tried using post IDs, but my reader expects flat sequential file names and doesn't like the big, spaced-out numbers.

I think a good compromise, if the API limitations can't be worked around, would be to add a flag that names the files 001.png, 002.png, 003.png, etc. based on the order of the IDs, even if that falls apart when any of the pages were uploaded out of order.
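The renaming scheme suggested above could look roughly like the following sketch: sort the posts by ID, then hand out zero-padded page numbers in that order. This is only an illustration under the stated assumption that upload order matches page order. The function `sequential_names` and the sample IDs are made up for the example and are not part of imageboard_downloader.

```rust
/// Maps each post ID to a flat, zero-padded page name (001.png, 002.png, ...)
/// based on ascending ID order. Each input item is (post ID, file extension).
fn sequential_names(posts: &[(u64, &str)]) -> Vec<(u64, String)> {
    let mut sorted: Vec<(u64, &str)> = posts.to_vec();
    sorted.sort_by_key(|&(id, _)| id);

    // Pad to at least three digits, and more if the pool has 1000+ pages.
    let width = sorted.len().to_string().len().max(3);

    sorted
        .iter()
        .enumerate()
        .map(|(index, &(id, ext))| {
            (id, format!("{:0width$}.{}", index + 1, ext, width = width))
        })
        .collect()
}

fn main() {
    // Made-up post IDs, deliberately listed out of order.
    let posts = [(3120045, "png"), (3119872, "png"), (3121530, "jpg")];

    for (id, name) in sequential_names(&posts) {
        println!("{id} -> {name}");
    }
}
```

Keeping the original extension per page avoids breaking pools that mix PNG and JPEG files, while the fixed-width number gives readers the flat sequential names they expect.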

FerrahWolfeh commented 1 year ago

I managed to hack together something that seems to work OK for e6 and danbooru. Give it a try from the latest main.

videah commented 1 year ago

Not sure if I'm using it right, but this command seems to download stuff from e621's front page forever rather than from the pool:

imageboard_downloader --cbz --pool 6837 -i e621 -o .

FerrahWolfeh commented 1 year ago

Now it's fixed in tag 1.5.1. I just forgot to... enable the pool path for e621.

videah commented 1 year ago

Seems to work now! A new issue, though, is that the --cbz flag produces no output after downloading. The images download in sequence otherwise.

FerrahWolfeh commented 1 year ago

Yeah, it's currently by design that the --cbz mode doesn't print the full results of the download. The default and cbz modes share a lot of code, but some things are still a little trickier to do in the latter.

Since pool downloading is working fine now, I'm closing the issue. Feel free to open another one in case of a bug or something.

videah commented 1 year ago

Oh, by no output I mean there is no .cbz file in the directory; it's just empty! Thank you for implementing this for me regardless, I will find it very useful.

FerrahWolfeh commented 1 year ago

Oh, I see. Could you please open another issue about this and show me the command you're using? I know that cbz works quite strangely with -O and -o.