mikf / gallery-dl

Command-line program to download image galleries and collections from several image hosting sites
GNU General Public License v2.0
Smart abort? #617

Open rEnr3n opened 4 years ago

rEnr3n commented 4 years ago
$ gallery-dl --version
1.13.0

I am downloading from an Instagram account with a command like

$ gallery-dl https://www.instagram.com/jennierubyjane/

and I run that command on a schedule.

I would like another value(?) for extractor.*.skip where the extractor would abort if all media of the current post were already downloaded.

What I'm asking is similar to "abort" but that value aborts on a per-file basis. If the previous run was interrupted, the next run will skip files that have not been downloaded yet. I want it to abort on a per-post basis instead.

Here's how it looks with the current "abort":

$ cat config.txt
{
    "extractor": {
        "skip": "abort"
    }
}
$ gallery-dl --ignore-config -c config.txt https://www.instagram.com/jennierubyjane
./gallery-dl/instagram/jennierubyjane/2247359936384083873_2247359933070648226.jpg
./gallery-dl/instagram/jennierubyjane/2247359936384083873_2247359933037000739.jpg
^C
KeyboardInterrupt
$ rm ./gallery-dl/instagram/jennierubyjane/2247359936384083873_2247359933037000739.jpg
$ gallery-dl --ignore-config -c config.txt https://www.instagram.com/jennierubyjane
./gallery-dl/instagram/jennierubyjane/2247359936384083873_2247359933070648226.jpg

Hrxn commented 4 years ago

~Do you use the archive file setting?~

Edit: That config.txt blurb is your whole config setting, I assume?

> What I'm asking is similar to "abort" but that value aborts on a per-file basis.

Yeah, that's the current behaviour, I'm not sure if I can follow..

So do I get this right, the issue is only with multiple-image/"carousel" posts on IG?

rEnr3n commented 4 years ago

> That config.txt blurb is your whole config setting, I assume?

Yes. Just for testing.

> So do I get this right, the issue is only with multiple-image/"carousel" posts on IG?

This can apply to other sites as well if they allow each post to contain more than one media.

Example:

instagram
|-- username
    |-- postid1_01.jpg # Post 1
    |-- postid1_02.jpg # Post 1
    |-- postid2_01.jpg # Post 2
    |-- postid2_02.jpg # Post 2
    |-- postid3_01.jpg # Post 3
    |-- postid3_02.jpg # Post 3

With the current "abort" value, if I interrupt the run after postid1_01.jpg, the next run aborts as soon as it finds that existing file, so postid1_02.jpg and everything after it are never downloaded. I want it to download the rest, since Post 1 was only partially downloaded: only its first image was saved.
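To make the difference concrete, here is a minimal Python sketch of the two behaviours on the example tree above. It is only a model and not gallery-dl's code: the function names abort_per_file and abort_per_post and the REMOTE listing are hypothetical, and it assumes files of the same post arrive consecutively.

```python
from itertools import groupby

# Hypothetical remote listing as (post_id, filename) pairs, newest first.
REMOTE = [
    ("postid1", "postid1_01.jpg"),
    ("postid1", "postid1_02.jpg"),
    ("postid2", "postid2_01.jpg"),
    ("postid2", "postid2_02.jpg"),
    ("postid3", "postid3_01.jpg"),
    ("postid3", "postid3_02.jpg"),
]


def abort_per_file(remote, on_disk):
    """Model of the current "abort": stop at the first file that already exists."""
    downloaded = []
    for _post, name in remote:
        if name in on_disk:
            break
        downloaded.append(name)
    return downloaded


def abort_per_post(remote, on_disk):
    """Model of the requested value: stop only when a whole post already exists."""
    downloaded = []
    for _post, group in groupby(remote, key=lambda item: item[0]):
        names = [name for _post_id, name in group]
        missing = [name for name in names if name not in on_disk]
        if not missing:
            break  # every file of this post is present, so abort here
        downloaded.extend(missing)
    return downloaded


# The previous run was interrupted after the first file of Post 1.
on_disk = {"postid1_01.jpg"}
print(abort_per_file(REMOTE, on_disk))  # nothing is fetched
print(abort_per_post(REMOTE, on_disk))  # the remaining five files are fetched
```

In this model the per-post variant still stops early on a scheduled run, because it aborts as soon as it meets a post whose files are all on disk.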

mikf commented 4 years ago

You can use "skip": "abort:N", where N is the number of consecutive skips before it stops (see extractor.skip). This should be good enough if you choose an N between 10 and 20, but it is obviously going to stop prematurely if there are more than N files in a single post.
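The trade-off of "abort:N" can be illustrated with a small Python model. This is a sketch under assumptions, not gallery-dl's implementation: the function abort_after_n_skips and the file names are hypothetical, and the counter is assumed to reset whenever a file is actually downloaded.

```python
def abort_after_n_skips(remote, on_disk, n):
    """Download missing files; stop once n files in a row were skipped."""
    downloaded = []
    consecutive_skips = 0
    for name in remote:
        if name in on_disk:
            consecutive_skips += 1
            if consecutive_skips >= n:
                break  # n consecutive skips reached, abort
        else:
            consecutive_skips = 0  # assumed: a download resets the counter
            downloaded.append(name)
    return downloaded


# Hypothetical listing: one post with four files, then an older post.
remote = ["p1_01.jpg", "p1_02.jpg", "p1_03.jpg", "p1_04.jpg", "p2_01.jpg"]
# The previous run was interrupted before the fourth file of the first post.
on_disk = {"p1_01.jpg", "p1_02.jpg", "p1_03.jpg"}

print(abort_after_n_skips(remote, on_disk, 3))   # N too small: stops prematurely
print(abort_after_n_skips(remote, on_disk, 10))  # larger N: missing files are fetched
```

With N=3 the model aborts before reaching p1_04.jpg, which is the premature stop described above; a larger N avoids it at the cost of checking more files that already exist.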

sserenade commented 4 years ago

I think I might have a similar use case. I'm using skip with abort:5 for manga chapters, but this results in the first 5 pages of a chapter being downloaded, aborting, and then moving to the next chapter. The behavior I'd like to be able to configure is: if 5 chapters were skipped, abort the entire series.
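A rough Python sketch of that chapter-level rule, for illustration only. The function abort_after_skipped_chapters, its max_skipped parameter, and the sample data are hypothetical; a chapter counts as skipped here only when all of its pages already exist.

```python
def abort_after_skipped_chapters(chapters, on_disk, max_skipped):
    """Fetch missing pages; abort the series after max_skipped complete chapters in a row."""
    downloaded = []
    skipped_in_a_row = 0
    for _chapter, pages in chapters:
        missing = [page for page in pages if page not in on_disk]
        if missing:
            skipped_in_a_row = 0  # this chapter needed work, reset the counter
            downloaded.extend(missing)
            continue
        skipped_in_a_row += 1
        if skipped_in_a_row >= max_skipped:
            break  # enough fully downloaded chapters seen, stop the series
    return downloaded


# Hypothetical series, newest chapter first, two pages per chapter.
chapters = [
    ("c5", ["c5_p1", "c5_p2"]),
    ("c4", ["c4_p1", "c4_p2"]),
    ("c3", ["c3_p1", "c3_p2"]),
    ("c2", ["c2_p1", "c2_p2"]),
]
# Chapters 3 and 4 are complete; chapter 5 is new; chapter 2 was never fetched.
on_disk = {"c4_p1", "c4_p2", "c3_p1", "c3_p2"}

print(abort_after_skipped_chapters(chapters, on_disk, 2))
```

Counting chapters instead of pages means a single partially downloaded chapter never triggers the abort, whatever its page count.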

kattjevfel commented 3 years ago

I think I'm after the same feature as @sserenade: I'd love to be able to abort after X skipped chapters. I have "chapter-range": "1-3" set along with "chapter-reverse": true, but realised this results in new mangas only getting the latest 3 chapters.

The above also would not help if someone batch-uploaded more than 3 chapters at once (or if I've been offline for a bit too long).