Closed cheese529 closed 1 year ago
Do you also have parent-metadata set accordingly in "reddit": { }?
Because you definitely need it.
"reddit":
{
"#": "only spawn child extractors for links to specific sites",
"whitelist": ["imgur", "redgifs", "gfycat"],
"#": "put files from child extractors into the reddit directory",
"parent-directory": true,
"#": "transfer metadata to any child extractor as '_reddit'",
"parent-metadata": "_reddit"
},
Pinging @cheese529
Did you test this again? Because it should be working as intended.
I do indeed have the parent metadata enabled. I will test this again after I come home from uni and see if it is working.
@Hrxn I can confirm this is still not working. In fact, it now refuses to download content from redgifs at all: I see the link being passed, but nothing is downloaded. Here's an example link [NSFW]: https://www.reddit.com/r/pawg/comments/16lqs4b/im_really_dragging_a_wagon_back_here/
$ gallery-dl https://www.reddit.com/r/pawg/comments/16lqs4b/im_really_dragging_a_wagon_back_here/
/tmp/_/redgifs/redgifs_klutzygrowingpuppy.mp4
$ gallery-dl -o parent-metadata=_reddit --filter "print(_reddit['title'])" https://www.reddit.com/r/pawg/comments/16lqs4b/im_really_dragging_a_wagon_back_here/
I’m really dragging a wagon back here
Same, also works on my machine ™️
PS D:\> python.exe $current_gallery_dl_master -o base-directory="." --verbose 'https://www.reddit.com/r/pawg/comments/16lqs4b/im_really_dragging_a_wagon_back_here/'
Debug : gallery-dl -> Version 1.26.0-dev
Debug : gallery-dl -> Python 3.11.5 - Windows-10-10.0.19045-SP0
Debug : gallery-dl -> requests 2.31.0 - urllib3 2.0.4
Debug : gallery-dl -> Configuration Files ['%USERPROFILE%\\gallery-dl.conf']
Debug : gallery-dl -> Starting DownloadJob for 'https://www.reddit.com/r/pawg/comments/16lqs4b/im_really_dragging_a_wagon_back_here/'
Debug : reddit -> Using RedditSubmissionExtractor for 'https://www.reddit.com/r/pawg/comments/16lqs4b/im_really_dragging_a_wagon_back_here/'
Debug : reddit -> Using custom API credentials (client-id pPax3*****************)
Info : reddit -> Refreshing private access token
Debug : urllib3.connectionpool -> Starting new HTTPS connection (1): www.reddit.com:443
Debug : urllib3.connectionpool -> https://www.reddit.com:443 "POST /api/v1/access_token HTTP/1.1" 200 775
Debug : reddit -> Sleeping 0.10 seconds (request)
Debug : urllib3.connectionpool -> Starting new HTTPS connection (1): oauth.reddit.com:443
Debug : urllib3.connectionpool -> https://oauth.reddit.com:443 "GET /comments/16lqs4b/.json?limit=0&raw_json=1 HTTP/1.1" 200 4426
Debug : reddit -> Using download archive 'E:\Home\Meta\gallery-dl\archive\gallery-dl.archive.reddit.db'
Debug : reddit -> Active postprocessor modules: [ClassifyPP]
Debug : redgifs -> Using RedgifsImageExtractor for 'https://v3.redgifs.com/watch/klutzygrowingpuppy'
Debug : cookies -> Extracting cookies from C:\Users\Hrxn\AppData\Local\Google\Chrome\User Data\Profile 4\Network\Cookies
Debug : cookies -> Found Local State file at 'C:\Users\Hrxn\AppData\Local\Google\Chrome\User Data\Local State'
Info : cookies -> Extracted 2847 cookies from Chrome
Debug : cookies -> Cookie version breakdown: {'v10': 2847, 'other': 0, 'unencrypted': 0}
Debug : urllib3.connectionpool -> Starting new HTTPS connection (1): api.redgifs.com:443
Debug : urllib3.connectionpool -> https://api.redgifs.com:443 "GET /v2/auth/temporary HTTP/1.1" 200 None
Debug : urllib3.connectionpool -> https://api.redgifs.com:443 "GET /v2/gifs/klutzygrowingpuppy HTTP/1.1" 200 None
Debug : redgifs -> Using download archive 'E:\Home\Meta\gallery-dl\archive\gallery-dl.archive.redgifs.db'
Debug : urllib3.connectionpool -> Starting new HTTPS connection (1): thumbs46.redgifs.com:443
Debug : urllib3.connectionpool -> https://thumbs46.redgifs.com:443 "GET /KlutzyGrowingPuppy.mp4?expires=1695170400&signature=v2:e50912752de870cc343c5bb33ac5405aed3ae744e95d65fec6a36d817f2218e9&for=2a00:6020:b314:8e00&hash=6163438793 HTTP/1.1" 200 6468716
.\Reddit\S\Pawg\Unsorted\+Clips\2023-09-18.I_m_really_dragging_a_wagon_back_here.lil-braids.Score=2815.Comments=15.16lqs4b.mp4
PS D:\>
Could you post a full --verbose log?
@Hrxn I did some messing around with my config and I think I figured it out. I had to add the "whitelist": ["redgifs"], option in order for it to download (weird, because it was not blacklisted). Without this option in the config, it would refuse to download.
Regarding the metadata not being passed, it is still a bug. I will post a verbose log in a few minutes along with a text file filled with links you can use to test. They are NSFW so please be cautious.
My Current Config: https://mega.nz/file/Ep8CgRzB#iQEzMeAd4RvMEBZaMZpjyk-nAQR04HC8sSPak7QFvP8
Link to Verbose Log: https://pastebin.pl/view/08e08569
Link of URLs to test: https://pastebin.pl/view/55bb2bb6
Please let me know if something is wrong with my config as well although I don't think so.
That also works ... (I removed the whitelist setting, by the way)
$ gallery-dl --config-ignore -c myconfig.json https://www.reddit.com/r/pawg/comments/16lqs4b/im_really_dragging_a_wagon_back_here/
/tmp/_/parent-test/reddit/pawg/redgifs/I’m really dragging a wagon back here - 2023-09-18 16lqs4b.mp4
Maybe the settings from your second config file are somehow interfering?
[gallery-dl][debug] Configuration Files ['%APPDATA%\\gallery-dl\\config.json', 'C:\\Users\\mnoor\\Videos\\reddit cofig\\config1.json']
Why not put it on https://gist.github.com/ ?
Will there be some sort of substitute for the whitelist setting or would I have to use blacklist if I want to avoid downloading from certain sites? Good point about the settings from my second config, I'll test again with just 1 config. Also did you try any of the URLs I sent in the pastebin? None of them pass down metadata. And BTW Thank you for telling me about https://gist.github.com/. This is wonderful, I will be using this for everything now :)
Will there be some sort of substitute for the whitelist setting or would I have to use blacklist if I want to avoid downloading from certain sites?
Well, you said that you needed to add a whitelist to make it work in the first place, but for me it worked even without.
Also did you try any of the URLs I sent in the pastebin?
They all link to gfycat, so they don't work anymore now that the site no longer exists. At least for the moment (#4558).
This is not an issue with it not passing metadata.
They all link to gfycat, so they don't work anymore now that the site no longer exists. At least for the moment (#4558).
Alright, I think that explains everything now. I'm sorry, I should have clarified that I am still using this version: https://github.com/mikf/gallery-dl/commit/28798594e8dd165909ffd6d44578d7a109aae2a0
Interestingly enough, some of those gfycat URLs still download, just with the native file naming scheme that gfycat and redgifs use (e.g., UnrulyScholarlyAmoeba). That is why I assumed this was an issue with it not passing metadata. I believe it might be possible to figure something out here, like you mentioned in #4558. I will look into it more.
Since #4558 is its own special case, I think this one here can be closed?
I am still using this version https://github.com/mikf/gallery-dl/commit/28798594e8dd165909ffd6d44578d7a109aae2a0
In this case, you might get this to work by adding "parent-metadata": true to your gfycat settings.
parent-metadata only works for direct descendants/children, and for gfycat links that are actually hosted on redgifs, it goes reddit -> gfycat -> redgifs, and gfycat never passes its reddit metadata down to redgifs.
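The direct-children-only behaviour described above can be illustrated with a small toy model. This is only a sketch and not gallery-dl's actual code: the function `propagate`, its parameters, and the sample metadata values are all hypothetical, and the merge rule (values handed down by the parent win) is an assumption made for illustration.

```python
# Toy model of "parent-metadata" along a chain of extractors.
# NOT gallery-dl's implementation; all names and values are made up.

def propagate(chain, settings, own_metadata):
    """Return the metadata each extractor in `chain` ends up with.

    chain:        extractor names ordered from parent to child
    settings:     per-extractor options; only "parent-metadata" is read
    own_metadata: metadata each extractor produces by itself
    """
    seen = {}
    handed_down = {}  # what the direct parent passed to this extractor
    for name in chain:
        metadata = dict(own_metadata.get(name, {}))
        metadata.update(handed_down)  # assumption: parent's values win
        seen[name] = metadata

        option = settings.get(name, {}).get("parent-metadata")
        if isinstance(option, str):
            # String: nest this extractor's metadata under that key.
            handed_down = {option: metadata}
        elif option is True:
            # True: pass this extractor's metadata on as-is.
            handed_down = dict(metadata)
        else:
            # Unset: the next extractor receives nothing from this one.
            handed_down = {}
    return seen


chain = ["reddit", "gfycat", "redgifs"]
own = {
    "reddit": {"title": "example post"},
    "gfycat": {},
    "redgifs": {"filename": "ExampleClipName"},
}

# Only reddit passes metadata down: it reaches gfycat, but not redgifs.
without = propagate(chain, {"reddit": {"parent-metadata": "_reddit"}}, own)

# gfycat also forwards what it has, so "_reddit" reaches redgifs.
with_fix = propagate(
    chain,
    {
        "reddit": {"parent-metadata": "_reddit"},
        "gfycat": {"parent-metadata": True},
    },
    own,
)

print("_reddit" in without["redgifs"])   # False
print("_reddit" in with_fix["redgifs"])  # True
```

In this model, the middle extractor is the one that drops the data, which is why the suggested workaround goes into the gfycat settings rather than the reddit or redgifs ones.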
Alright @mikf, the metadata is now being passed down correctly without "parent-metadata": true in my gfycat settings, so it seems whatever you did in #4558 solved it.
Unfortunately, this has also caused another issue: all media in the comments is now ignored due to duplicate filenames. How could we solve this so that all of that media is downloaded as well? Would it be possible to somehow use the default settings in my config for media linked in the comments/posts with multiple media links?
Verbose Log: https://gist.github.com/cheese529/3a28388d1ac0156bcf0a4f67c8b44276
Despite having the following inside my config for redgifs
"filename":
{
"'_reddit' in locals()": "{_reddit[title]} {_reddit[date]:%Y-%m-%d} {_reddit[id]}.{extension}",
"not locals().get('title')": "{filename}.{extension}"
}
the reddit metadata is not passed forward. All the files downloaded have just the redgifs filename.
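To see why a missing `_reddit` field ends up as a plain redgifs filename, here is a rough approximation of how such a conditional filename mapping could be evaluated. It assumes the conditions are Python expressions checked in order against the file's metadata, with the first true one selecting the format string. The helper `pick_filename` and the sample metadata are hypothetical, and `eval` plus `str.format_map` only stand in for gallery-dl's own expression and format handling.

```python
from datetime import datetime

# Conditional filename mapping as quoted from the redgifs config above.
FILENAME_RULES = {
    "'_reddit' in locals()":
        "{_reddit[title]} {_reddit[date]:%Y-%m-%d} {_reddit[id]}.{extension}",
    "not locals().get('title')":
        "{filename}.{extension}",
}


def pick_filename(rules, metadata):
    """Return the filename from the first rule whose condition is true.

    Hypothetical helper: conditions are evaluated with the metadata dict
    as local namespace, so locals() inside them refers to that dict.
    Only use eval() like this on configuration you trust.
    """
    for condition, template in rules.items():
        if eval(condition, {}, dict(metadata)):
            return template.format_map(metadata)
    return None  # no rule matched; fallback handling is not modelled here


# Sample metadata (made-up values) with and without the parent's data.
redgifs_only = {"filename": "ExampleClipName", "extension": "mp4"}
with_reddit = dict(
    redgifs_only,
    _reddit={
        "title": "example post title",
        "date": datetime(2023, 9, 18),
        "id": "16lqs4b",
    },
)

print(pick_filename(FILENAME_RULES, with_reddit))
print(pick_filename(FILENAME_RULES, redgifs_only))
```

With `_reddit` present the first rule applies; without it, only the second rule can match, so the result is just the site's own filename. That matches the symptom reported here and suggests checking whether `_reddit` actually reaches the redgifs extractor.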