mikf / gallery-dl

Command-line program to download image galleries and collections from several image hosting sites
GNU General Public License v2.0

Kemono revisions with archive file #5695

Closed: Hawker2 closed this issue 2 weeks ago

Hawker2 commented 3 weeks ago

I'm running into some issues downloading revisions from Kemono when using an archive file. For instance, take the following link (NSFW: https://kemono.su/fanbox/user/10415882/post/4006007) and the following config:

{
    "extractor":
    {
        "skip": true,
        "path-restrict": "windows",
        "path-strip": "windows",
        "sleep": 0,
        "user-agent": "Mozilla/5.0 (X11; Linux x86_64; rv:68.0) Gecko/20100101 Firefox/68.0",
        "base-directory": "~/Desktop/",

        "kemonoparty": {
            "metadata": true,
            "revisions": "unique",
            "directory": ["{category}", "{service} {user} {username}", "{id} {title[:60]}"],
            "archive": "~/Desktop/{category}/{service} {user} {username}/archive.db",
            "filename": "{id}_{num:>02}_{filename[:60]}.{extension}",
            "postprocessors": [
                {
                    "name": "metadata",
                    "event": "post",
                    "filename": "{id} {title}.txt",

                    "#": "write text content and external URLs",
                    "mode": "custom",
                    "format": "{content}\n{embed[url]:?/\n/}",

                    "#": "only write file if there is an external link present",
                    "filter": "embed.get('url') or re.search(r'(?i)(gigafile|xgf|1drv|mediafire|mega|google|drive)', content)"
                }
            ]
        }
    },

    "output":
    {
        "skip": false
    }
}

This appears to download only the first revision, and the cause seems to be an interaction with the download archive. If the archive is removed, the revisions do download (specifically the zip archives that were removed in later revisions), but a dozen duplicate images also pour in, which is odd, since there are only two revisions.

Is the archive working as intended with revisions? I would expect the same output as the first run with an archive, plus the zip files from the earlier revision.

mikf commented 3 weeks ago

The default archive-format value for kemono ("{service}_{user}_{id}_{num}") does not account for revisions. It produces the same archive ID for every revision of a post, so only the files of the first revision are downloaded.

You need to set it to a custom value that includes at least one of the revision_... values. For example "{service}_{user}_{id}_{num}_{revision_index}".
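
As a rough sketch of where that setting would go, here is the kemonoparty block from the config above cut down to the relevant keys. Only the archive-format line is new; the other postprocessor and path options from the original config are omitted for brevity and would stay unchanged. This has not been run against the post in question, so treat it as a starting point rather than a verified fix.

```json
{
    "extractor": {
        "kemonoparty": {
            "revisions": "unique",
            "archive": "~/Desktop/{category}/{service} {user} {username}/archive.db",
            "archive-format": "{service}_{user}_{id}_{num}_{revision_index}"
        }
    }
}
```

With {revision_index} in the format string, files from different revisions of the same post should get distinct archive IDs instead of colliding on "{service}_{user}_{id}_{num}".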

taskhawk commented 2 weeks ago

Including {hash} as one of the values in archive-format also works, btw.