mikf / gallery-dl

Command-line program to download image galleries and collections from several image hosting sites
GNU General Public License v2.0

[kemono.party] downloading text content of posts #1760

Closed Doofy420 closed 2 years ago

Doofy420 commented 2 years ago

I'm trying to do what's described in #1278, but I would like to have everything in a single txt/json file without downloading any files (text only, for link scraping). Is there any way to do this? The solution in the previously mentioned issue does not work well when downloading multiple posts to a single directory.

mikf commented 2 years ago

> to have everything in a single txt/json file

Not currently possible, but you can just cat (or whatever Windows equivalent there is) all the single files into one, or use grep on all of them at once, etc.
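For anyone without `cat` at hand, the concatenation step can be sketched in a few lines of Python. This is only an illustration and not part of gallery-dl; the function name `concatenate_files` and its parameters are made up for this example.

```python
from pathlib import Path


def concatenate_files(source_dir, pattern, output_path):
    """Write every file in source_dir matching pattern into output_path.

    Hypothetical helper for illustration; returns the number of files merged.
    """
    output_path = Path(output_path)
    # Collect the inputs first, skipping the output file itself in case it
    # lives in the same directory and matches the pattern.
    sources = sorted(
        path for path in Path(source_dir).glob(pattern)
        if path.is_file() and path.resolve() != output_path.resolve()
    )
    with output_path.open("w", encoding="utf-8") as out:
        for path in sources:
            text = path.read_text(encoding="utf-8")
            out.write(text)
            # Keep files separated even if one lacks a trailing newline.
            if not text.endswith("\n"):
                out.write("\n")
    return len(sources)
```

Something like `concatenate_files("/path/to/dir", "*.txt", "/path/to/dir/all_posts.txt")` would then produce one combined file, with the inputs merged in sorted filename order.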

> while also not downloading any file

--no-download

Doofy420 commented 2 years ago

I see, thanks for clarifying. Is there any way to segment the filenames for the txt files then, so I can keep them in one directory? I download everything to a single folder per creator.

mikf commented 2 years ago

The metadata post processor has a directory option, which can be set to an absolute path to write all files into a single directory. Alternatively, you can override the general directory option and use -d to specify the target directory:

$ gallery-dl --no-download -o directory= -d /path/to/dir <URLs>

Doofy420 commented 2 years ago

I have mine configured like this:

            "metadata": true,
            "filename": "{id}.{filename}.{extension}",
            "directory": ["{category}", "{service}", "{username}"],

The only problem I'm having is that only one text post is saved in the text file (the very first post on a creator's page). I suppose this is because the solution in #1278 was designed to save the text files in a different directory per post?

mikf commented 2 years ago

You need to adjust the filenames generated by the metadata post processor. Not a static info.json like in https://github.com/mikf/gallery-dl/issues/1278#issuecomment-768350318 but something like {id}.json that is unique per post.

            "postprocessors": [
                {
                    "name": "metadata",
                    "event": "post",
                    "filename": "{id}.json"
                }
            ]

Doofy420 commented 2 years ago

It works! It would be nice to have the option to put all the text posts in one file from the get-go, but this works OK too. Thank you very much.
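For the link-scraping use case, the per-post `{id}.json` files can be reduced to a single mapping of links afterwards. The following is a rough sketch, not something gallery-dl provides: `collect_links` is a hypothetical helper, the URL regex is deliberately simple, and it assumes the post text is stored under a `content` key in each metadata file (adjust `field` if your files differ).

```python
import json
import re
from pathlib import Path

# Simple URL matcher; stops at whitespace, quotes, and angle brackets.
URL_PATTERN = re.compile(r"https?://[^\s\"'<>]+")


def collect_links(metadata_dir, field="content"):
    """Map each metadata file's stem (e.g. the post id) to the URLs in `field`.

    Hypothetical helper; `field` is an assumed key name for the post text.
    """
    links = {}
    for path in sorted(Path(metadata_dir).glob("*.json")):
        with path.open(encoding="utf-8") as fp:
            post = json.load(fp)
        if not isinstance(post, dict):
            continue  # not a per-post metadata object
        text = post.get(field)
        if not isinstance(text, str):
            continue  # field missing or not text
        found = URL_PATTERN.findall(text)
        if found:
            links[path.stem] = found
    return links
```

The result can be written out with `json.dump(collect_links("/path/to/dir"), fp, indent=2)` to get everything in one file.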