Open a-washing-machine opened 1 year ago
Short and incomplete answer for now: There is an is_original metadata field (#4559) that can be used in conditional filenames / directories to distinguish between low/full res images.
"filename": {
    "is_original": "filename",
    "" : "LOW_RES filename"
}
Where do I put this / what do I do with this in config.json?
I tried
"deviantart": { "filename": { "is_original": "filename", "" : "LOW_RES filename" }, },
But that results in the image file being named filename, and putting it in the postprocessors section
"deviantart": { "postprocessors": [ { "name": "metadata", "event": "post,skip", "filename": "{index}.json" } { "filename": { "is_original": "filename", "" : "LOW_RES filename" }, } ] },
gives [config][error] JSONDecodeError when loading '/home/specturion/.config/gallery-dl/config.json': Expecting ',' delimiter: line 121 column 17 (char 3108).
I looked at #4559 but it is unclear.
Well, it's not literally filename - that was just an example.
The default filename is "{category}_{index}_{title}.{extension}"
So use this:
{
    "extractor":
    {
        "deviantart":
        {
            "client-id": null,
            "client-secret": null,
            "refresh-token": null,
            "auto-watch": false,
            "auto-unwatch": false,
            "comments": false,
            "extra": false,
            "flat": true,
            "folders": false,
            "group": true,
            "include": "gallery",
            "journals": "html",
            "jwt": false,
            "mature": true,
            "metadata": false,
            "original": true,
            "pagination": "api",
            "public": true,
            "quality": 100,
            "wait-min": 0,
            "filename": {
                "is_original": "{category}_{index}_{title}.{extension}",
                "" : "LOW_RES_{category}_{index}_{title}.{extension}"
            }
        }
    }
}
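To make the behaviour of the conditional "filename" block above easier to picture, here is a rough Python sketch. It is not gallery-dl's actual code: pick_filename is a hypothetical helper, and it assumes that each key is a Python expression checked in order against the file's metadata, with the empty key "" used as the fallback.

```python
def pick_filename(conditions, metadata):
    """Return the filename built from the first condition that holds.

    Hypothetical helper for illustration only, not part of gallery-dl.
    """
    for expr, fmt in conditions.items():
        if expr == "":
            continue  # the empty key is the fallback, handled below
        # Assumption: conditions are Python expressions over metadata fields.
        if eval(expr, {}, dict(metadata)):
            return fmt.format(**metadata)
    if "" in conditions:
        return conditions[""].format(**metadata)
    raise LookupError("no condition matched and no default was given")


conditions = {
    "is_original": "{category}_{index}_{title}.{extension}",
    "": "LOW_RES_{category}_{index}_{title}.{extension}",
}

full_res = {
    "category": "deviantart",
    "index": 123456789,
    "title": "Some Artwork Title",
    "extension": "png",
    "is_original": True,
}
low_res = dict(full_res, extension="jpg", is_original=False)

print(pick_filename(conditions, full_res))
print(pick_filename(conditions, low_res))
```

With the sample metadata, the full-resolution entry keeps the default name and the fallback entry gets the LOW_RES_ prefix. The metadata values here are made up for the example.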
This works. I also had to replace is_original with is_downloadable to get it to work. Thank you for the help.
Is there any possibility for the return of the feature which enabled " downloading non-downloadable images in HQ" in DA? :')
@stillweebing Possible, yes. Likely? Not sure. It depends on dA, nothing that can be done on gallery-dl's side (Unless someone discovers some new, previously unknown workaround).
So, the above doesn't really work for me. Something can still be is_downloadable=false when it's not actually blocked from being downloaded. That flag just seems to block the download button on the website, but doesn't interfere with gallery-dl. is_original also suffers from the same sort of inconsistency.
What's interesting is that when you do a -K check on a specific image that's totally behind a paywall, there is a parameter called "tier_access" set to locked that you would think you could use conditionally. But when you do a -K check on an image that isn't behind a paywall, that parameter is completely omitted. This is relevant because trying to do the conditional logic with:
"directory": {
"tier_access == 'locked'": ["{category}","HAIDeviantArt","X_da-{author[username]}","Trash"],
"" : ["{category}","HAIDeviantArt","X_da-{author[username]}"]
}
results in: "Applying directory format string failed (NameError: name 'tier_access' is not defined)"
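The NameError above is what plain Python does when an expression mentions a name that is not defined. The sketch below reproduces that in isolation and shows one possible guard, looking the field up with locals().get() so that a missing field yields None instead of raising. That gallery-dl evaluates its conditions this way is an assumption based on the error message; matches is a hypothetical helper, and the guarded expression has not been tested inside gallery-dl itself.

```python
def matches(expr, metadata):
    """Evaluate a condition string with the metadata fields as local names.

    Hypothetical helper for illustration only, not part of gallery-dl.
    """
    return bool(eval(expr, {}, dict(metadata)))


locked = {"tier_access": "locked"}
public = {}  # the field is omitted entirely for images without a paywall

naive = "tier_access == 'locked'"
guarded = "locals().get('tier_access') == 'locked'"

print(matches(naive, locked))
try:
    matches(naive, public)
except NameError as exc:
    print("NameError:", exc)

# The guarded form never touches an undefined name.
print(matches(guarded, locked))
print(matches(guarded, public))
```

If the assumption holds, using the guarded expression as the key in the "directory" object might avoid the error for images that lack tier_access.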
@mikf
Neat. :)
Well, okay, I haven't gotten around to really testing this yet --- but quick question, would this be currently possible somehow:
If "deviantart_123456789_Some Artwork Title.png" already exists, in that case DON'T download "deviantart_123456789_[LOW_RES_PREFIX]_Some Artwork Title.jpg". (also considering that the LOW_RES image might be a JPG, while the old HD image might be a PNG)
If not currently possible, I imagine it would create a ton of undesired clutter needlessly downloading low-res versions of images I already have in HD, and mess up the abort-parameter. :/
It would probably "only" be a problem the first time I re-parse with the current gallery-dl version (I haven't done a reparse since October!), but cleaning up the clutter would take A LOT of manual work afterwards. :(
So it'd be great if there was some way of preventing clutter in the first place. ^_^;;
...soooo I take it that's not currently possible?
@Corrupt-Specturion
This works. I also had to replace is_original with is_downloadable to get it to work. Thank you for the help.
@sbobbo
So, the above doesn't really work for me. Something can be still is_downloadable=false when it's not actually blocked from being downloaded.
Hmm. For me, is_original works better than is_downloadable.
is_downloadable produces plenty of false positives, some of which aren't even image files (e.g. html submissions).
Comparing an older copy (August 2023) from one of my "benchmark galleries" with ~1850 submissions against a fresh download with the LOWRES prefix enabled, I found that is_original correctly marked all and only those images where width/height had changed.
(I've got a tool to recursively compare two given folders with images for changes in image-dimensions and/or file-size between the images in said folders. Very useful to check for site changes. ;-)
OR: If HD image cannot be downloaded with current means, add suffix to filename to avoid filename collision should HD image download become possible again in the future
Regarding deviantArt closing the loophole that allowed downloading "non-downloadable" images in high resolution;
I'd like to request an option that makes gallery-dl add a suffix to the filename of every file where a full resolution download is currently impossible due to recent changes made by DA, to indicate that the file is a low resolution fallback download.
Something like "deviantart_123456789_Some Artwork Title_LOW_RES_SUFFIX.jpg" that is easily findable. Optional via config file, I suppose.
This way, should full res downloads somehow become possible again at some point in the future --- even years from now --- there will be no filename collision between full res and low res downloads, and any missing HD images can be added into your existing download folder by simply re-parsing the galleries in your download queue without the abort-parameter.
I'd rather have the low res version of images on my hard drive right now and sort them out when a higher resolution image becomes available later, than risk artworks being deleted before I can download them.
But I also want to avoid being unable to tell low-res and full res files apart, like what happened to me here: https://github.com/mikf/gallery-dl/issues/2846
( ...I never got around to doing that full reparse, it just would have taken up too much space AND taken too long. -_- )
Let me say it this way: I've already got tools I can use to sort out "which of these two images with the same deviantArt ID has the higher image resolution", and do it recursively for a folder-hierarchy ... as long as I have both low res and high res images on my hard drive, and any presumed "lower resolution" images have a consistent, fixed, easily findable filename suffix that isn't likely to cause false positives with any artwork titles. ;)
Furthermore, it would be sensible to prevent downloading the low res file if the full res file is already on the system.
As pointed out here https://github.com/mikf/gallery-dl/issues/4652#issuecomment-1773921752 , that's what's happening with the current version of gallery-dl, as low-res fallback preview images are often (but not always!) JPG files and the full resolution images already downloaded prior may not be. This does lead to unwanted clutter.
Now I don't mind too much if some images get downloaded twice because an artist changed an artwork's title and thus avoided a filename collision with existing files; that doesn't seem to happen as much as you'd think.
But it would be sensible to include a clause "if (downloading a low-res fallback preview image) => check if file of same filename (without the added suffix!) but different file extension (JPG, PNG, GIF) already exists." If yes, skip.
(The exception to this is if the file on the hard drive is a non-image file, then sure, by all means do download the preview file too, I'm all for downloading both in non-image cases.)
In short:
If "deviantart_123456789_Some Artwork Title.png" already exists, DON'T download "deviantart_123456789_Some Artwork Title_LOW_RES_SUFFIX.jpg".
But if "deviantart_123456789_Some Artwork Title_LOW_RES_SUFFIX.jpg" exists, it SHOULD download "deviantart_123456789_Some Artwork Title.png" in the future should high res download become possible again.
Apologies for being a bit wordy, can't think of how to compress it further. :/
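As a rough sketch of the requested skip rule (not an existing gallery-dl option): before saving a low-res fallback, strip the suffix from its name and check whether an image with the same base name but any image extension is already in the folder. The function name should_skip_low_res, the suffix string, and the extension list are all hypothetical choices made for this example.

```python
from pathlib import Path

# Hypothetical values for illustration; neither is a gallery-dl setting.
LOW_RES_SUFFIX = "_LOW_RES_SUFFIX"
IMAGE_EXTS = {".jpg", ".jpeg", ".png", ".gif"}


def should_skip_low_res(target, suffix=LOW_RES_SUFFIX, image_exts=IMAGE_EXTS):
    """Return True if `target` is a low-res fallback whose full-res image exists.

    The check ignores the file extension, since the fallback is often a JPG
    while the earlier full-res download may be a PNG or GIF.
    """
    target = Path(target)
    if not target.stem.endswith(suffix):
        return False  # a full-res download is never skipped by this rule
    base = target.stem[: -len(suffix)]
    if not target.parent.is_dir():
        return False
    # Compare stems directly instead of globbing, because artwork titles
    # may contain characters such as [ ] or * that glob treats specially.
    for candidate in target.parent.iterdir():
        if (
            candidate.is_file()
            and candidate.stem == base
            and candidate.suffix.lower() in image_exts
        ):
            return True
    return False
```

A non-image file with the same base name (for example a text or HTML submission) does not trigger the skip, which matches the exception described above. The reverse case needs no extra logic: the full-res name never ends in the suffix, so it is always downloaded.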