mikf / gallery-dl

Command-line program to download image galleries and collections from several image hosting sites
GNU General Public License v2.0

reddit subreddit download breaking into subfolders? I turned that off? #2202

Closed: bschollnick closed this issue 1 year ago

bschollnick commented 2 years ago

Folks,

I'm sure this is something that I've missed, but I would appreciate any help you can give...

I'm trying to download from a subreddit, and that works, but it's creating subfolders for imgur, red gifs, etc... I don't want that. I thought I solved that in this configuration file, but it's not acting the way I would expect from the docs?

{ "extractor": {

    "#": "replace invalid path characters with unicode alternatives",
    "path-restrict": {
        "\\": "⧹",
        "/" : "⧸",
        "|" : "│",
        ":" : "꞉",
        "*" : "∗",
        "?" : "？",
        "\"": "″",
        "<" : "﹤",
        ">" : "﹥"
    },
    "extension-map": {
        "jpeg": "jpg",
        "jpe" : "jpg",
        "jfif": "jpg",
        "jif" : "jpg",
        "jfi" : "jpg"
    },
    "reddit":
    {
        "#": "only spawn child extractors for links to specific sites",
        "whitelist": ["imgur", "redgifs", "gfycat"],

        "#": "put files from child extractors into the reddit directory",
        "parent-directory": true,

        "parent-metadata": "_reddit",
        "mature": true,
        "directory": [""],
        "filename": "{title}—{filename}.{extension}",
        "flat": true,
        "wait-min": 0,
        "videos": true
    }
}

}

The parent-directory and parent-metadata options are supposed to make any other extractor behave as if it were reddit? So since I'm not putting reddit content in a subfolder, they shouldn't either?

Yet the imgur, red gifs, gfycat subfolders are still being populated and created?

Hrxn commented 2 years ago

I think you're almost there.

The Reddit extractor spawns other extractors for sites it recognizes, so you have to pass on metadata.

For the reddit part in the config, this is what matters:

            "parent-directory": true,
            "parent-metadata": "_reddit",

You can re-use the metadata in any spawned extractor then, like imgur for example:

        "imgur":
        {
            "image":
            {
                "archive-format": "{id}",
                "directory": {
                    "'_reddit' in locals() and extension in ('mp4', 'webm')": ["+Clips"],
                    "'_reddit' in locals() and extension in ('gif', 'apng')": ["+Gifs"],
                    "'_reddit' in locals()"                                 : [],
                    "extension in ('mp4', 'webm')"                          : ["Imgur", "Anims", "{bkey}", "{ckey}", "{tkey}", "{skey}", "{mkey}"],
                    ""                                                      : ["Imgur", "Pics", "{bkey}", "{ckey}", "{tkey}", "{skey}", "{mkey}"]
                },

Ignore the "image" part here; it means this config applies only to the image sub-extractor of imgur. You can specify the same settings one level above, i.e. for imgur in general.

The trick is to check for reddit metadata, as in the first three lines of this conditional "directory" block. The simplest example is the third line. It evaluates the expression "'_reddit' in locals()", which is true if this imgur invocation comes from the reddit extractor, and then sets the directory to [], i.e. it turns the extra subfolders off.
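To make the matching order concrete, here is a minimal Python sketch of how a conditional "directory" block like this could be resolved. It is an illustration only, not gallery-dl's actual code: pick_directory is a hypothetical helper, the rules are a shortened copy of the block above (the {bkey}-style segments are dropped), and it assumes each key is evaluated as a Python expression with the file's metadata as local names, with the empty key acting as the fallback.

```python
def pick_directory(rules, metadata):
    """Return the directory list of the first rule whose expression is true.

    Hypothetical helper for illustration; not part of gallery-dl.
    """
    fallback = rules.get("", [])
    for expression, directory in rules.items():
        if not expression:
            continue  # the empty key is only used when nothing else matches
        # Evaluate the key with the metadata as local names, so that
        # "'_reddit' in locals()" tests whether parent metadata is present.
        if eval(expression, {}, dict(metadata)):
            return directory
    return fallback


# Shortened version of the imgur rules from the comment above.
rules = {
    "'_reddit' in locals() and extension in ('mp4', 'webm')": ["+Clips"],
    "'_reddit' in locals() and extension in ('gif', 'apng')": ["+Gifs"],
    "'_reddit' in locals()": [],
    "extension in ('mp4', 'webm')": ["Imgur", "Anims"],
    "": ["Imgur", "Pics"],
}

# Assumed metadata shapes: with "parent-metadata": "_reddit", the child
# extractor's metadata carries the reddit post under the "_reddit" key.
from_reddit = {"extension": "png", "_reddit": {"subreddit": "example"}}
standalone = {"extension": "png"}

print(pick_directory(rules, from_reddit))  # []
print(pick_directory(rules, standalone))   # ['Imgur', 'Pics']
```

Because the first true expression wins in this sketch, the more specific '_reddit' rules have to come before the plain "'_reddit' in locals()" rule, which is also how the block above is ordered.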

github-account1111 commented 2 years ago

Sorry to hijack but this is a related question: can the same be done with the postprocessor metadata fields? This:

"reddit": {
  "parent-directory": true,
  "parent-metadata": "parent"
},
"imgur": {
  "postprocessors": {
    "'parent' in locals()": [
        {
        "name": "exec",
        "command": [
          "exiftool",
          "-overwrite_original",
          "-title=https://www.reddit.com{parent[permalink]}",
          "{parent[_path][4:]}"
        ]
      }
    ]
  }
}

This outputs [postprocessor][warning] module ''parent' in locals()' not found. The directory and filename settings work beautifully using Hrxn's example.

rEnr3n commented 2 years ago

There are two parameters I needed for this: parent-metadata and category-transfer.

config:

{
    "extractor": {
        "reddit": {
            "parent-metadata": true,
            "category-transfer": true,
            "directory": ["{category}", "{subreddit}", "[{id}] {title}"],
            "filename": "{filename}.{extension}"
        }
    }
}

$ gallery-dl --ignore-config -c config https://www.reddit.com/r/twice/comments/tx2v96/digital_art_twice_artwork_on_the_final_rplace/
./gallery-dl/reddit/twice/[tx2v96] [Digital Art] TWICE Artwork on the Final r_place Canvas/eKnJAz7.png
./gallery-dl/reddit/twice/[tx2v96] [Digital Art] TWICE Artwork on the Final r_place Canvas/n4iNMTf.png
$ gallery-dl --ignore-config -c config https://www.reddit.com/gallery/tx2v96
./gallery-dl/reddit/twice/[tx2v96] [Digital Art] TWICE Artwork on the Final r_place Canvas/eKnJAz7.png
./gallery-dl/reddit/twice/[tx2v96] [Digital Art] TWICE Artwork on the Final r_place Canvas/n4iNMTf.png
$ gallery-dl --ignore-config -c config https://www.reddit.com/r/twice/comments/txjp35/220406_beautiful_nayeon/
./gallery-dl/reddit/twice/[txjp35] 220406 - Beautiful Nayeon/None.mp4
$ gallery-dl --ignore-config -c config https://www.reddit.com/gallery/txjp35
./gallery-dl/reddit/twice/[txjp35] 220406 - Beautiful Nayeon/None.mp4

Hope that helps.
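The paths in the listing above follow from the "directory" list in that config: each entry is a format string filled in from the post's metadata, and the results become nested folders. Here is a rough Python sketch of that idea, assuming plain str.format substitution and assuming a "/" inside a value is replaced with "_" (which matches the r_place in the listing). build_directory is a hypothetical helper, and the metadata values, including the title, are reconstructed from the output above rather than taken from Reddit.

```python
import posixpath


def build_directory(segments, metadata, base="./gallery-dl"):
    """Fill each format string from metadata and join the results.

    Hypothetical helper for illustration; not gallery-dl's implementation.
    """
    parts = []
    for segment in segments:
        value = segment.format(**metadata)
        # Assumption: a "/" in a value would split the folder, so swap it out.
        parts.append(value.replace("/", "_"))
    return posixpath.join(base, *parts)


directory = ["{category}", "{subreddit}", "[{id}] {title}"]

# Assumed metadata for the first post in the listing above.
post = {
    "category": "reddit",
    "subreddit": "twice",
    "id": "tx2v96",
    "title": "[Digital Art] TWICE Artwork on the Final r/place Canvas",
}

print(build_directory(directory, post))
```

Under these assumptions the sketch prints the same folder as the first two downloads in the listing.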