mikf / gallery-dl

Command-line program to download image galleries and collections from several image hosting sites
GNU General Public License v2.0

[Instagram] Support for ignoring pinned posts for extractor.*.skip #2752

Open Infinitay opened 2 years ago

Infinitay commented 2 years ago

Instagram recently introduced a feature that lets users pin certain posts to their profiles. In my opinion this interferes directly with the skip extractor option, because pinned posts are listed first and the current behavior therefore treats them as the latest posts. Depending on how many pinned media files there are and on your skip extractor option, the rest of the posts would be skipped.

For example, consider the following Instagram profile: https://www.instagram.com/h1ghrmusic/. At the time of this issue it has three pinned posts. Post one has one piece of media, post two has eight, and post three has two.

gallery-dl -u <username> -p <password> \
    --download-archive gallery-dl_instagram_archive \
    -o extractor.instagram.archive-prefix="{category}] {username}({owner_id})" \
    -o extractor.instagram.archive-format=" {subcategory}_{shortcode}" \
    -o extractor.instagram.skip="abort:2" -o extractor.instagram.include="posts" \
    --mtime-from-date --verbose \
    https://www.instagram.com/h1ghrmusic/

I ran the command above once and forcefully terminated it as soon as it had downloaded the first two posts, or eight pieces of media.

I then ran the same command again, still with -o extractor.instagram.skip="abort:2". It aborted immediately, as it should have, because it found two pieces of media already in the archive. To confirm, I changed the option to -o extractor.instagram.skip="abort:3", and it aborted as well. When I changed it to -o extractor.instagram.skip="abort:10", it started downloading the remaining images.
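To make the effect concrete, here is a minimal Python sketch of how a consecutive-skip counter would behave on this profile. It is a simplified model, not gallery-dl's actual code: the function name `simulate_run`, the media labels, and the assumption that exactly the first eight pinned files ended up in the archive are all made up for illustration.

```python
def simulate_run(media, archive, abort_after):
    """Return the files a run would download before it aborts.

    Simplified model of skip="abort:N": stop after N consecutive
    files that are already in the archive (assumed behavior).
    """
    downloaded = []
    consecutive_skips = 0
    for item in media:
        if item in archive:
            consecutive_skips += 1
            if consecutive_skips >= abort_after:
                break  # abort the whole run
        else:
            consecutive_skips = 0  # a download resets the counter
            downloaded.append(item)
    return downloaded


# Hypothetical labels: three pinned posts with 1, 8 and 2 files,
# followed by regular posts that were never reached.
pinned = ["pin1_1"] + [f"pin2_{i}" for i in range(1, 9)] + ["pin3_1", "pin3_2"]
regular = ["new_1", "new_2"]
media = pinned + regular

# Assumption: the interrupted first run archived the first eight files.
archive = set(pinned[:8])

for n in (2, 3, 10):
    print(f"abort:{n} ->", simulate_run(media, archive, n))
```

In this model, abort:2 and abort:3 stop inside the pinned posts and download nothing, while abort:10 gets past the eight archived files and continues with everything after them.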

I would like to request either a new extractor option or updated skip behavior for the Instagram module, so that pinned posts are ignored when counting toward the argument(s) passed to the skip extractor option.

afterdelight commented 2 years ago

why did you want to skip a pinned post?

Infinitay commented 2 years ago

> why did you want to skip a pinned post?

I realize now that I did not clarify this: with the new extractor option, pinned posts would still be downloaded if they are not in the archive. However, if the media in a pinned post is already in the archive, it should not count toward the skip extractor option, for example toward the n in abort:n.
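As a rough sketch of the behavior I am asking for (a hypothetical model, not how gallery-dl is implemented; `simulate_run_ignoring_pinned`, the `(name, is_pinned)` tuples, and the sample data are invented for illustration):

```python
def simulate_run_ignoring_pinned(media, archive, abort_after):
    """Model of the requested behavior for skip="abort:N".

    Archived files from pinned posts are skipped but do not count
    toward N; pinned files missing from the archive are downloaded.
    """
    downloaded = []
    consecutive_skips = 0
    for name, is_pinned in media:
        if name in archive:
            if not is_pinned:
                consecutive_skips += 1
                if consecutive_skips >= abort_after:
                    break  # only regular posts can trigger the abort
        else:
            consecutive_skips = 0
            downloaded.append(name)
    return downloaded


# Hypothetical profile: three pinned files, then regular posts.
media = [
    ("pin_a", True), ("pin_b", True), ("pin_c", True),
    ("new_1", False), ("old_1", False), ("old_2", False), ("old_3", False),
]
archive = {"pin_a", "pin_b", "old_1", "old_2", "old_3"}

print(simulate_run_ignoring_pinned(media, archive, 2))
```

Here the two archived pinned files are passed over without aborting, the new pinned file and the new regular post are downloaded, and the run stops once it hits two consecutive archived regular files.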

I hope that clears up any confusion.

afterdelight commented 2 years ago

I still don't understand.

mikf commented 2 years ago

This is not entirely what you requested (integrating something like this into the whole skip/archive logic is complicated and error prone), but it at least allows you to filter by pinned status: https://github.com/mikf/gallery-dl/commit/467a2a4d35433c6f1aafc173c6b98f47860da13d.

Use --filter "not pinned" to ignore any pinned posts, or --filter "pinned or abort()" to only download the first few pinned posts.
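A simplified Python sketch of how such filter expressions behave when evaluated per post. This is an illustration under assumptions, not gallery-dl's actual implementation: `StopRun`, `select`, the sample shortcodes, and the use of plain booleans for `pinned` are all hypothetical.

```python
class StopRun(Exception):
    """Stand-in for whatever signal ends the extractor run."""


def abort():
    # Called from inside a filter expression to stop processing.
    raise StopRun()


def select(posts, expression):
    """Return shortcodes of posts for which the expression is truthy."""
    code = compile(expression, "<filter>", "eval")
    selected = []
    for post in posts:
        # Post metadata fields become variables in the expression.
        namespace = {"abort": abort, **post}
        try:
            if eval(code, namespace):
                selected.append(post["shortcode"])
        except StopRun:
            break  # abort() ends the run at the first non-matching post
    return selected


# Hypothetical metadata: pinned posts are listed before regular ones.
posts = [
    {"shortcode": "p1", "pinned": True},
    {"shortcode": "p2", "pinned": True},
    {"shortcode": "p3", "pinned": True},
    {"shortcode": "r1", "pinned": False},
    {"shortcode": "r2", "pinned": False},
]

print(select(posts, "not pinned"))
print(select(posts, "pinned or abort()"))
```

Because `or` short-circuits, `abort()` is only called for the first post that is not pinned, which is why the second expression stops right after the pinned posts at the top of the profile.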

Infinitay commented 2 years ago

It might not be what I requested, but I appreciate it nonetheless. I'll go ahead and modify my current script to re-run the command, this time passing in --filter "not pinned". It is probably not efficient and might trigger locks/bot checks sooner, but it is better than nothing.

Thanks again

Also, do you have a certain release schedule for Chocolatey pushes?

mikf commented 2 years ago

I do not have any control over Chocolatey releases, or really over any of those listed on https://repology.org/project/gallery-dl/versions. I am only responsible for GitHub and Snap releases; everything else is handled by someone else.

The one in charge of Chocolatey releases seems to be Starz0r here on GitHub. Maybe ping him?

Hrxn commented 2 years ago

You mean GitHub, Snap, and PyPI (the official Python Package Index)?