mikf / gallery-dl

Command-line program to download image galleries and collections from several image hosting sites
GNU General Public License v2.0

Deviantart: Download image descriptions #2319

Open FlashFried opened 2 years ago

FlashFried commented 2 years ago

This probably has been answered before but I'm having difficulty figuring out how to get image descriptions to also download.

AlttiRi commented 2 years ago

1. Here I described how to create a gallery-dl.conf file, set up a custom file name, work around the 429 Too Many Requests error, and download watcher-only artworks: https://github.com/mikf/gallery-dl/issues/2143#issuecomment-1002192200

2. Then add two fields (metadata and postprocessors) to extractor.deviantart in the gallery-dl.conf file:

"metadata": true,
"postprocessors": [{
    "name": "metadata",
    "mode": "custom",
    "filename": "[{category}] {author[username]}—{index}—{date:%Y.%m.%d}—{title}.html",
    "directory": "metadata",
    "extension": "html",
    "format": "<h1 style='display: inline'><a href='{url}'>{title}</a></h1> by <a href='https://www.deviantart.com/{username}'>{author[username]}</a><div><br></div><div class='content'>{description}</div><br><div><hr><div class='tags'>[\"{tags:J\", \"}\"]</div><hr></div><div>{date:%Y.%m.%d}</div><br>\n\n"
}]
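
As a rough illustration of how the placeholders in "filename" resolve: gallery-dl format strings build on Python's str.format syntax, so the nested lookup {author[username]} and the date spec {date:%Y.%m.%d} can be tried in plain Python. The metadata dictionary below is invented sample data, not real API output, and the {tags:J", "} join conversion is a gallery-dl extension that plain str.format does not understand, so it is left out of this sketch.

```python
from datetime import datetime

# Hypothetical sample metadata; real values come from the DeviantArt API.
sample = {
    "category": "deviantart",
    "author": {"username": "SomeArtist"},
    "index": 123456789,
    "date": datetime(2022, 2, 20),
    "title": "Sample title",
}

# Same placeholders as the "filename" option above.
# \u2014 is the em dash the config uses as a separator.
template = (
    "[{category}] {author[username]}\u2014{index}"
    "\u2014{date:%Y.%m.%d}\u2014{title}.html"
)

# str.format handles the dictionary lookup and the strftime-style date spec.
filename = template.format(**sample)
print(filename)
```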

The full deviantart extractor setting will look like this:

        "deviantart":
        {
            "directory": ["[gallery-dl]", "[{category}] {author[username]}"],
            "filename": "[{category}] {author[username]}—{index}—{date:%Y.%m.%d}—{title}.{extension}",
            "client-id": "12345",
            "client-secret": "0123456789abcdef0123456789abcdef",
            "metadata": true,
            "postprocessors": [{
                "name": "metadata",
                "mode": "custom",
                "filename": "[{category}] {author[username]}—{index}—{date:%Y.%m.%d}—{title}.html",
                "directory": "metadata",
                "extension": "html",
                "format": "<h1 style='display: inline'><a href='{url}'>{title}</a></h1> by <a href='https://www.deviantart.com/{username}'>{author[username]}</a><div><br></div><div class='content'>{description}</div><br><div><hr><div class='tags'>[\"{tags:J\", \"}\"]</div><hr></div><div>{date:%Y.%m.%d}</div><br>\n\n"
            }]
        },

Don't forget to use your own client-id and client-secret.
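
To double-check that the file still parses after editing, a small Python sketch like the following can help. It assumes the deviantart block sits under a top-level "extractor" key, as in the steps above. The function names are made up for this example and are not part of gallery-dl.

```python
import json
from pathlib import Path


def deviantart_settings(conf_path):
    """Return the extractor.deviantart block of a gallery-dl.conf file."""
    # gallery-dl.conf is JSON; read it explicitly as UTF-8.
    text = Path(conf_path).read_text(encoding="utf-8")
    config = json.loads(text)
    return config.get("extractor", {}).get("deviantart", {})


def check_description_setup(settings):
    """List what is missing for the HTML description export (hypothetical helper)."""
    problems = []
    if settings.get("metadata") is not True:
        problems.append('"metadata" is not set to true')
    # Only inline postprocessor objects are inspected here.
    names = [
        pp.get("name")
        for pp in settings.get("postprocessors", [])
        if isinstance(pp, dict)
    ]
    if "metadata" not in names:
        problems.append('no "metadata" postprocessor configured')
    return problems
```

Calling check_description_setup(deviantart_settings("gallery-dl.conf")) returns an empty list when both fields are present. A JSON syntax error or an encoding problem surfaces as an exception from deviantart_settings.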


This creates a formatted HTML file for each deviation, containing the title, author, description, tags, and date.

A few examples: Example 1 and Example 2 (two screenshots, not preserved here).


A simpler setup that writes only the {description} data to a txt file: https://github.com/mikf/gallery-dl/blob/6fdcfa941c8cd339aab363f515b4e4ce20dc70f2/docs/gallery-dl-example.conf#L118-L127

K4sum1 commented 1 year ago

I did this, but I get this error when running gallery-dl:

    [config][error] UnicodeDecodeError when loading 'C:\Users\Admin\gallery-dl.conf': 'utf-8' codec can't decode byte 0x97 in position 230: invalid start byte

rautamiekka commented 1 year ago

> I did this, but I get this error when running gallery-dl [config][error] UnicodeDecodeError when loading 'C:\Users\Admin\gallery-dl.conf': 'utf-8' codec can't decode byte 0x97 in position 230: invalid start byte

Don't hijack threads.

The error message says almost exactly what's wrong.

K4sum1 commented 1 year ago

It's an issue with the config. I copied and pasted it, put in my id and secret, and got that error. I can't find any info about it and I don't know what to do, so I posted in the only place that seemed relevant.

AlttiRi commented 1 year ago

It says that the config contains an "illegal" character (a byte that is not valid UTF-8) at position 230. I have no idea how you got this error. Copy and paste the text carefully, without selecting anything else.
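
To see which byte the message is pointing at, the file can be inspected with a few lines of Python. This is only a sketch: the helper name is invented for illustration, and the sample bytes are made up to mimic a config line containing the 0x97 byte from the error.

```python
def first_invalid_utf8_byte(data):
    """Return (offset, byte value) of the first byte that is not valid UTF-8, or None."""
    try:
        data.decode("utf-8")
    except UnicodeDecodeError as exc:
        # exc.start is the byte offset that the error message calls "position".
        return exc.start, data[exc.start]
    return None


# Made-up sample: a config fragment containing a stray 0x97 byte.
broken = b'{"filename": "{index}\x97{title}"}'
print(first_invalid_utf8_byte(broken))
```

For a real file, pass the raw bytes, for example the result of open(path, "rb").read(). The bytes surrounding the reported offset usually show which character was affected.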

K4sum1 commented 1 year ago

It doesn't like the {index}—{date:%Y.%m.%d}—{title} part for both filenames

rautamiekka commented 1 year ago

It's the em-dash it doesn't like. That is likely GitHub's fault for converting a normal dash into one, which is why one must always use the code tag for this stuff.

AlttiRi commented 1 year ago

GitHub is fine with "—" (in a code block it looks like a "-" because it is displayed in a monospace font, but it is still "—").

A code block just uses a monospace font and preserves formatting by keeping multiple spaces: "     ".

No magic. It's just CSS.

...plus the extra copy button on the right.

Most likely, he saved the text file in a non-UTF-8 encoding, and gallery-dl then tried to read that non-UTF-8 text as UTF-8.
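
The 0x97 byte fits that explanation: in Windows-1252 the em dash is stored as the single byte 0x97, which is not a valid UTF-8 start byte. Below is a minimal sketch of converting such a file to UTF-8. It assumes the file really was saved as Windows-1252 (cp1252), which is a guess the thread does not confirm. The helper name is hypothetical.

```python
from pathlib import Path


def reencode_to_utf8(path, source_encoding="cp1252"):
    """Rewrite a text file as UTF-8, assuming it is currently in source_encoding.

    Returns True if the file was rewritten, False if it was already valid UTF-8.
    """
    path = Path(path)
    raw = path.read_bytes()
    try:
        raw.decode("utf-8")
        return False  # already valid UTF-8, leave the file untouched
    except UnicodeDecodeError:
        pass
    # Assumption: the bytes are in source_encoding (cp1252 by default).
    text = raw.decode(source_encoding)
    path.write_bytes(text.encode("utf-8"))
    return True
```

Making a backup copy of gallery-dl.conf before running something like this is a good idea, since a wrong guess about the source encoding would write wrong characters. Re-saving the file as UTF-8 from a text editor achieves the same result.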