mikf / gallery-dl

Command-line program to download image galleries and collections from several image hosting sites
GNU General Public License v2.0

[Tumblr] Blogs who only accept a "https://www.tumblr.com/<user name>/<ID>/" fail to download #3628

Closed arisboch closed 1 year ago

arisboch commented 1 year ago

There are some Tumblr blogs that won't accept a "https://<user name>.tumblr.com/post/" URL, but only a "https://www.tumblr.com/<user name>/<ID>/" URL. Attempts to download from these blogs fail. One of these blogs is https://www.tumblr.com/suf-fering. Could you take a look at it, please?

mikf commented 1 year ago

I'll look into it, but you can also manually rewrite those URLs to put them into the format expected by gallery-dl: https://suf-fering.tumblr.com/
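For anyone who wants to script that manual rewrite, here is a minimal sketch. It only assumes the URL shapes mentioned in this thread (`www.tumblr.com/<user name>`, `www.tumblr.com/<user name>/<ID>/...`, and the `blog/` and `blog/view/` prefixes); `to_subdomain_url` is a hypothetical helper name, not part of gallery-dl.

```python
from urllib.parse import urlsplit


def to_subdomain_url(url: str) -> str:
    """Rewrite a www.tumblr.com/<blog>[/<id>] URL to <blog>.tumblr.com form.

    Hypothetical helper: it covers only the URL shapes seen in this thread
    and returns any other URL unchanged.
    """
    parts = urlsplit(url)
    if parts.netloc.lower() != "www.tumblr.com":
        return url  # already a subdomain URL, or not Tumblr at all

    # Split the path into non-empty segments, e.g. ["suf-fering", "7009..."].
    segments = [s for s in parts.path.split("/") if s]

    # Drop the optional "blog/view/" or "blog/" prefix.
    if segments[:2] == ["blog", "view"]:
        segments = segments[2:]
    elif segments[:1] == ["blog"]:
        segments = segments[1:]

    if not segments:
        return url  # no blog name to work with

    blog = segments[0]
    base = f"https://{blog}.tumblr.com/"

    # A numeric second segment is treated as the post ID; a trailing slug is dropped.
    if len(segments) >= 2 and segments[1].isdigit():
        return f"{base}post/{segments[1]}"
    return base


if __name__ == "__main__":
    print(to_subdomain_url("https://www.tumblr.com/suf-fering"))
    print(to_subdomain_url("https://www.tumblr.com/suf-fering/700951855904227328"))
```

Note that this does not special-case site pages such as a dashboard or search path under www.tumblr.com, so it should only be fed blog or post URLs.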

mikf commented 1 year ago

So I took a look, but I'm not really sure what the problem is supposed to be.

Both https://suf-fering.tumblr.com/ as well as https://www.tumblr.com/suf-fering work as a blog URL, and

https://www.tumblr.com/suf-fering/700951855904227328, https://www.tumblr.com/blog/suf-fering/700951855904227328, https://www.tumblr.com/blog/view/suf-fering/700951855904227328, https://suf-fering.tumblr.com/post/700951855904227328, and https://suf-fering.tumblr.com/image/700951855904227328 get accepted as post URLs.

arisboch commented 1 year ago

I have no idea what kinda sorcery is happening on my computers, but it doesn't work with any of these URLs. I'm gonna post the verbose output and the Tumblr section of the .conf file:

gallery-dl --verbose  https://www.tumblr.com/suf-fering/700951855904227328/oomelet-i-want-to-see-her-in-another-outfit
[gallery-dl][debug] Version 1.25.0-dev
[gallery-dl][debug] Python 3.9.2 - Linux-5.10.102.1-microsoft-standard-WSL2-x86_64-with-glibc2.31
[gallery-dl][debug] requests 2.28.2 - urllib3 1.26.14
[gallery-dl][debug] Configuration Files ['${HOME}/.gallery-dl.conf']
[gallery-dl][debug] Starting DownloadJob for 'https://www.tumblr.com/suf-fering/700951855904227328/oomelet-i-want-to-see-her-in-another-outfit'
[tumblr][debug] Using custom api_key authentication
[tumblr][debug] Using TumblrPostExtractor for 'https://www.tumblr.com/suf-fering/700951855904227328/oomelet-i-want-to-see-her-in-another-outfit'
[urllib3.connectionpool][debug] Starting new HTTPS connection (1): api.tumblr.com:443
[urllib3.connectionpool][debug] https://api.tumblr.com:443 "GET /v2/blog/suf-fering.tumblr.com/posts?id=700951855904227328&offset=0&limit=50&reblog_info=true&api_key=[redacted] HTTP/1.1" 404 150
[tumblr][error] NotFoundError: Requested user or post could not be found

        "tumblr":
        {
            "avatar": false,
            "external": false,
            "inline": true,
            "posts": "all",
            "reblogs": true,
            "api-key": "[redacted]",
            "api-secret": "[redacted]",
            "filename": "tumblr {blog[name]} {id} {num:>02}.{extension}",
            "directory": ["tumblr {blog[name]} {id}"],
            "postprocessors": 
            [{
                "name": "metadata",
                "event": "post",
                "filename": "tumblr {blog[name]} {id} {num}.html",
                "mode": "custom",
                "format": "<meta charset='UTF-8'/>{body}<br><br>{caption}<br><br>{question}<br>{answer}<br>{tags!S}"
            }]
        },

Maybe I should create a new API key for gallery-dl? Is it possible that another .conf setting is somehow affecting the retrieval? Is it possible that the Python environment is somehow borked?

arisboch commented 1 year ago

https://www.tumblr.com/hismajestythebeggar

That one also doesn't work (warning, NSFW) ;(

mikf commented 1 year ago

Found the problem: Those blogs are "dashboard-only".

You need all four OAuth tokens to access them: api-key, api-secret, access-token, and access-token-secret.

You should see OAuth1.0 authentication instead of api-key authentication in the debug output when using all four tokens.

[tumblr][debug] Using custom OAuth1.0 authentication
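
A quick way to check whether a config actually carries all four values is sketched below. It assumes the Tumblr options live under `extractor` → `tumblr` in the JSON config and that the four option names are `api-key`, `api-secret`, `access-token`, and `access-token-secret`; `missing_tumblr_tokens` is a hypothetical helper, not a gallery-dl function, and the values in the example are placeholders.

```python
import json

# Option names assumed from this thread; all four must be set for OAuth1.0.
REQUIRED = ("api-key", "api-secret", "access-token", "access-token-secret")


def missing_tumblr_tokens(config: dict) -> list:
    """Return the names of Tumblr OAuth options that are absent or empty."""
    # Assumption: Tumblr options are nested under "extractor" -> "tumblr".
    tumblr = config.get("extractor", {}).get("tumblr", {})
    return [name for name in REQUIRED if not tumblr.get(name)]


# Placeholder config with only the two API values set, as in the report above.
example = json.loads("""
{
    "extractor": {
        "tumblr": {
            "api-key": "PLACEHOLDER_KEY",
            "api-secret": "PLACEHOLDER_SECRET"
        }
    }
}
""")

if __name__ == "__main__":
    print(missing_tumblr_tokens(example))
    # prints ['access-token', 'access-token-secret']
```

An empty list would mean all four values are present, which is when the debug output should report OAuth1.0 authentication.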

arisboch commented 1 year ago

I tried that, got the following URL back:

http://localhost:6414/?oauth_token=[redacted]&oauth_verifier=[redacted]#_=_

The "oauth_token" from the URL is the "access-token", and the "oauth_verifier" is the "access-token-secret", right? I tried that, but it still doesn't work ;-(

[gallery-dl][debug] Version 1.25.0-dev
[gallery-dl][debug] Python 3.9.2 - Linux-5.10.102.1-microsoft-standard-WSL2-x86_64-with-glibc2.31
[gallery-dl][debug] requests 2.28.2 - urllib3 1.26.14
[gallery-dl][debug] Configuration Files ['${HOME}/.gallery-dl.conf']
[gallery-dl][debug] Starting DownloadJob for 'https://www.tumblr.com/suf-fering/700951855904227328/oomelet-i-want-to-see-her-in-another-outfit'
[tumblr][debug] Using custom OAuth1.0 authentication
[tumblr][debug] Using TumblrPostExtractor for 'https://www.tumblr.com/suf-fering/700951855904227328/oomelet-i-want-to-see-her-in-another-outfit'
[urllib3.connectionpool][debug] Starting new HTTPS connection (1): api.tumblr.com:443
[urllib3.connectionpool][debug] https://api.tumblr.com:443 "GET /v2/blog/suf-fering.tumblr.com/posts?id=700951855904227328&offset=0&limit=50&reblog_info=true HTTP/1.1" 404 None
[tumblr][debug] {'meta': {'status': 404, 'msg': 'Not Found'}, 'response': [], 'errors': [{'title': 'Not Found', 'code': 4012, 'detail': 'This Tumblr is only viewable within the Tumblr dashboard'}]}
[tumblr][info] Run 'gallery-dl oauth:tumblr' to access dashboard-only blogs
[tumblr][error] AuthorizationError: This Tumblr is only viewable within the Tumblr dashboard

What am I doing wrong?

mikf commented 1 year ago

You didn't do anything wrong. I made a mistake in commit 70ce45d9659ac78bba64ae00542ed7535e4dcc43 which stopped any oauth procedure before it could finish and display the two access-token values.

oauth_token and oauth_verifier are intermediate values that are supposed to be sent to Tumblr, but that never happened due to this bug. Fixed in https://github.com/mikf/gallery-dl/commit/75570ad3f1c61085b0f0ea8df9d1e0637fa7d663, by the way.
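
To illustrate why the values in the redirect URL are not the final tokens, here is a small sketch that only parses such a callback URL. In an OAuth 1.0 flow these two values still have to be exchanged with the provider for the real access token and secret; that exchange is not shown here. `parse_oauth_callback` is a hypothetical helper and the URL values are placeholders.

```python
from urllib.parse import parse_qs, urlsplit


def parse_oauth_callback(url: str) -> dict:
    """Extract the intermediate OAuth values from a callback URL.

    These are NOT the access-token / access-token-secret; they are the
    inputs to the final token exchange, which this sketch does not perform.
    """
    # urlsplit separates the "#_=_" fragment from the query string.
    query = parse_qs(urlsplit(url).query)
    return {
        "oauth_token": query["oauth_token"][0],
        "oauth_verifier": query["oauth_verifier"][0],
    }


if __name__ == "__main__":
    # Placeholder values standing in for the redacted ones in the thread.
    callback = "http://localhost:6414/?oauth_token=TOKEN&oauth_verifier=VERIFIER#_=_"
    print(parse_oauth_callback(callback))
    # prints {'oauth_token': 'TOKEN', 'oauth_verifier': 'VERIFIER'}
```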

arisboch commented 1 year ago

@mikf Thanks a lot, now all works perfectly!