Closed ghost closed 1 year ago
What is causing the first animation in link 2 not to be downloaded?
The patreon-skip-file option. (#1689, 48647480)
In all Patreon posts on kemono that I've seen until now, it was always the main file that was a duplicate of another attachment file, but that doesn't always seem to hold true. (#1751)
Does gallery-dl distinguish between inline content, "files" content, and "attachment" content when downloading from a Patreon service on kemono.party?
There's a type metadata field that is either "file", "attachment", or "inline".
Have I simply configured something wrong?
You haven't. It's just that every attempt at fixing this "duplicate files for patreon posts" issue has failed so far, including the current "ignore main file if there are attachments".
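For illustration, the "ignore main file if there are attachments" heuristic could look roughly like the Python sketch below. This is not gallery-dl's actual code: the function name skip_main_file and the list-of-dicts input are hypothetical, and only the "type" values come from the metadata field mentioned above.

```python
def skip_main_file(files):
    """Drop entries of type "file" when the post also has attachments.

    `files` is a hypothetical list of metadata dicts, each with a "type"
    key that is "file", "attachment", or "inline".
    """
    has_attachments = any(f["type"] == "attachment" for f in files)
    if not has_attachments:
        # Nothing to deduplicate against, so keep everything.
        return list(files)
    # Assume the main file duplicates one of the attachments and skip it.
    return [f for f in files if f["type"] != "file"]


# Made-up post where the main file is NOT a duplicate of any attachment.
post = [
    {"type": "file", "name": "anim1.gif"},
    {"type": "attachment", "name": "anim2.gif"},
    {"type": "attachment", "name": "anim3.gif"},
]
kept = skip_main_file(post)
print([f["name"] for f in kept])
```

In a post like this made-up one, the heuristic drops a unique main file, which would match a first animation that never gets downloaded.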
BTW, for new files, the SHA-256 hash taken from the URL can be used to determine whether two files are really the same or just have the same name.
Ah, I see. Thanks for clearing that up. I suppose I'll just have to download everything and manually remove duplicates, then.
The patreon-skip-file option. (#1689, 4864748) In all Patreon posts on kemono that I've seen until now, it was always the main file that was a duplicate of another attachment file, but that doesn't always seem to hold true. (#1751)
Yeah. I think I made the issue that led to that option being included, actually. Heh.
Does gallery-dl distinguish between inline content, "files" content, and "attachment" content when downloading from a Patreon service on kemono.party?
There's a type metadata field that is either "file", "attachment", or "inline".
That's good to know. There may be something I use that for.
Have I simply configured something wrong? You haven't. It's just that every attempt at fixing this "duplicate files for patreon posts" issue has failed so far, including the current "ignore main file if there are attachments".
Well, for what it's worth, the "ignore main file if there are attachments" approach does filter out the vast, vast majority of duplicates and it's mostly solved kemono's data duplication. I just seem to have found an artist or a post that happens to store data differently.
BTW, for new files, the SHA-256 hash taken from the URL can be used to determine whether two files are really the same or just have the same name.
Is there a download comparison option in gallery-dl that does that? I've looked through some of the comparison options in the config documentation but I don't remember seeing something like that.
It's the new URL format introduced 4 days ago. Currently, not all files use it.
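As a rough sketch of the idea (not an existing gallery-dl option), duplicates could be detected by pulling the hash out of each URL. The assumption here is that a hash-based URL contains a 64-character hex string somewhere in its path; the example URLs, the example.org host, and the function names are made up for illustration. URLs without a hash are always kept, since not all files use the new format.

```python
import re
from urllib.parse import urlsplit

# Assumption: hash-based URLs carry a 64-character hex string (SHA-256)
# somewhere in the path, e.g. /data/<64 hex chars>.png
SHA256_RE = re.compile(r"(?<![0-9a-f])[0-9a-f]{64}(?![0-9a-f])")


def url_sha256(url):
    """Return the SHA-256 hex digest embedded in the URL path, or None."""
    match = SHA256_RE.search(urlsplit(url).path.lower())
    return match.group(0) if match else None


def split_duplicates(urls):
    """Split URLs into (unique, duplicates) based on the embedded hash."""
    unique, duplicates, seen = [], [], set()
    for url in urls:
        digest = url_sha256(url)
        if digest is not None and digest in seen:
            duplicates.append(url)  # same content as an earlier URL
            continue
        if digest is not None:
            seen.add(digest)
        unique.append(url)  # first occurrence, or no hash to compare
    return unique, duplicates


h = "ab" * 32  # placeholder 64-character hex string, not a real hash
urls = [
    f"https://example.org/data/{h}.png",
    f"https://example.org/data/{h}.png?f=other_name.png",
    "https://example.org/data/files/old_style.png",
]
unique, duplicates = split_duplicates(urls)
print(unique)
print(duplicates)
```

Files that only share a name but have different hashes would both end up in the unique list.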
There are some cases where the images aren't posted in the 'files' area but in the 'content' area, and the downloader skips the content ones. The images aren't links, just inline.
@TestPolygon
Currently, not all files use it.
And they still do not, even more than a week later. Maybe these changes only got applied to patreon posts.
$ gallery-dl -g https://kemono.party/gumroad/user/trylsc/post/IURjT
https://kemono.party/data/files/gumroad/trylsc/IURjT/reward8.jpg
https://kemono.party/data/attachments/gumroad/trylsc/IURjT/$3.zip
@skyvory inline images are supposed to be supported, unless the URLs in newer posts got changed and aren't picked up by gallery-dl.
$ gallery-dl -g https://kemono.party/fanbox/user/7356311/post/802343
https://kemono.party/data/inline/fanbox/uaozO4Yga6ydkGIJFAQDixfE.jpeg
@mikf
For the particular artist that I wanted to download, another factor may be that the inline images are links to an outside source (Imgur) instead of being direct uploads to Kemono. I'm not exactly sure how Patreon allows creators to upload images to posts, but if we look at https://kemono.party/patreon/user/4577256/post/53013824, and right click > view image/open image in new tab, we stay on kemono.party.
For my artist, you can look at https://kemono.party/patreon/user/4577256/post/53013824 (mostly SFW, some minor nudity), and right click > view image/open image in new tab, we are redirected to an Imgur page.
I'm not sure if this is something gallery-dl accounts for when crawling kemono patreon posts. From some minor testing, it doesn't seem to recognize that these embedded/inline images are even there.
In any event, the workaround that I'm using now is simple but somewhat tedious using JDownloader 2:
Not sure if this is the best place, apologies. But I noticed with this URL that the main attachment 404s but the inline image isn't available to download:
https://kemono.party/patreon/user/7453087/post/33060907
Not too sure how that differs from the one posted earlier, which does come through as an inline post. Most likely because it has both a file and an inline image?
https://kemono.party/fanbox/user/7356311/post/802343
@valdearg fixed in https://github.com/mikf/gallery-dl/commit/db857b40d8e813926db44d00f4c95ea4544812b8. The inline image URL there started with https://kemono.party/ instead of the expected /inline.
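To illustrate the kind of mismatch involved, here is a small Python sketch that accepts both the relative and the absolute form of an inline image URL. It is not the code from the commit above; the function name inline_path, the root parameter, and the example paths are assumptions made for this example.

```python
from urllib.parse import urlsplit


def inline_path(src, root="https://kemono.party"):
    """Return the site-relative path of an inline image, or None.

    Accepts both "/inline/..." and "https://kemono.party/inline/...".
    """
    parts = urlsplit(src)
    if parts.netloc and parts.netloc != urlsplit(root).netloc:
        # Hosted on another site (e.g. Imgur), so not an inline upload.
        return None
    return parts.path if parts.path.startswith("/inline") else None


# Made-up example paths
print(inline_path("/inline/fanbox/abc.jpeg"))
print(inline_path("https://kemono.party/inline/fanbox/abc.jpeg"))
print(inline_path("https://i.imgur.com/xyz.png"))
```

Matching only on a leading /inline would miss the second form, while externally hosted images such as the Imgur ones mentioned earlier are a separate case.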
You're amazing! Thanks, that's got it!
[This might look like a wall of text, but I don't think it's actually that much information. Thanks in advance.]
I am attempting to download some files from kemono.party, but the behaviour of the downloader seems inconsistent depending on whether the target post has its content uploaded as files or attachments, and which ones are duplicates (because of course that's still a problem on kemono.party). I am using gallery-dl 1.18.4-dev.
Target URLs [no nudity, but NSFW]:
It might be worth noting that link 2 doesn't have any images listed under "content" on the page, but if you look at the image URLs you can see that the first image is under hostname/files/etc and the others are under hostname/attachments/etc.
The JSON for my gallery-dl config file:
I have configured it this way to force all Patreon attachment filenames to use underscores instead of spaces, which protects against duplicate files with slightly different filenames. It has worked for me for several months.
When using this config, I downloaded all images except for animation 1 from link 2, and there were no duplicates, but because of the filenames the order of each picture was jumbled. I tried to change the JSON to download everything and put them in the correct order:
This config improved the filenames to be in order, but it didn't download the missing picture from the first config and it downloaded the duplicate animation from link 2.
I tried to see what keywords/filters I could use in the filename by using gallery-dl -K [link 2], but that did not seem to help: according to gallery-dl, the num (index) of each picture in that link starts at 1 with the duplicate animations. Even when I remove the distinction between Patreon and other services (or remove the filename block entirely), gallery-dl does not download the first animation.
In summary:
For reference, here is the command and verbose output when using the second config.