bbolli / tumblr-utils

Utilities for dealing with Tumblr blogs, Tumblr backup
GNU General Public License v3.0

youtube-dl requests do not work in the EU/EEA #132

Closed nightpool closed 5 years ago

nightpool commented 5 years ago

log:

nightpool@neo:~/tumblr$ tumblr-utils/tumblr_backup.py --save-video --save-audio -j nightpool
HTTP Error 404: Not Found downloading https://66.media.tumblr.com/avatar_503751d2df9a_1280.pnj
WARNING: Could not send HEAD request to https://www.tumblr.com/privacy/consent?redirect=http%3A%2F%2Fnightpool.tumblr.com%2Fpost%2F179983473830%2F: HTTP Error 404: Not Found
WARNING: Falling back on generic information extractor.
WARNING: URL could be a direct video link, returning it as such.
Unable to download video in post #179983473830
WARNING: Could not send HEAD request to https://www.tumblr.com/privacy/consent?redirect=http%3A%2F%2Fnightpool.tumblr.com%2Fpost%2F177891560995%2F: HTTP Error 404: Not Found

Should be pretty self-explanatory. I believe the consent process sets a cookie, and we need to send that cookie with every request.
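As a rough illustration of "send that cookie with every request", here is a minimal sketch using only the Python 3 standard library. It assumes the consent cookie has already been exported from a browser into a Netscape-format cookie file. The function name `build_opener_with_cookies` is hypothetical and is not part of tumblr_backup.py or youtube-dl. The script in this thread runs under Python 2, where the equivalent module is `cookielib`.

```python
import http.cookiejar
import urllib.request


def build_opener_with_cookies(cookie_path):
    """Return (opener, jar); requests made through opener send the jar's cookies.

    Hypothetical helper for illustration only.
    """
    jar = http.cookiejar.MozillaCookieJar(cookie_path)
    # Keep session cookies and expired entries so nothing exported is silently dropped.
    jar.load(ignore_discard=True, ignore_expires=True)
    handler = urllib.request.HTTPCookieProcessor(jar)
    opener = urllib.request.build_opener(handler)
    return opener, jar
```

Every `opener.open(url)` call then attaches the matching cookies automatically, which is the behaviour the backup requests would need.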

cebtenzzre commented 5 years ago

Also, with new enough youtube-dl, this error appears frequently on NSFW blogs:

ERROR: This Tumblr may contain sensitive media. Disable safe mode in your account settings at https://www.tumblr.com/settings/account#safe_mode

This could be solved by using youtube-dl's cookiefile parameter. I have actually done so locally, but my change uses a hardcoded path.
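One way to avoid the hardcoded path is to take it from an argument or an environment variable. The sketch below is only an assumption about how that could look: the helper name `build_ydl_options` and the environment variable `TUMBLR_COOKIEFILE` are made up for this example. `cookiefile` and the other keys are real youtube-dl options, used here as sample settings.

```python
import os


def build_ydl_options(cookiefile=None):
    """Build a youtube-dl options dict with an optional cookie file.

    Hypothetical helper: the path comes from the argument, or else from the
    TUMBLR_COOKIEFILE environment variable (a name invented for this sketch).
    """
    opts = {
        'nooverwrites': True,
        'retries': 3000,
        'fragment_retries': 3000,
        'ignoreerrors': True,
    }
    cookiefile = cookiefile or os.environ.get('TUMBLR_COOKIEFILE')
    if cookiefile:
        # Expand "~" so the same setting works for any user.
        opts['cookiefile'] = os.path.expanduser(cookiefile)
    return opts
```

The resulting dict could then be passed to `youtube_dl.YoutubeDL(...)` in place of the literal dict.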

Doty1154 commented 5 years ago

@Cebtenzzre How did you tell youtube-dl to use a cookie file from the Python script? With an environment variable, or some other way?

cebtenzzre commented 5 years ago
diff --git a/tumblr_backup.py b/tumblr_backup.py
index 61547b3..fca3d18 100755
--- a/tumblr_backup.py
+++ b/tumblr_backup.py
@@ -722,7 +722,8 @@ class TumblrPost:
             'nooverwrites': True,
             'retries': 3000,           
             'fragment_retries': 3000,
-            'ignoreerrors': True
+            'ignoreerrors': True,
+            'cookiefile': '/home/cebtenzzre/.local/share/tumblr-utils/cookies.txt'
         })
         ydl.add_default_info_extractors()
         try:

I exported cookies.txt using the cookies.txt browser extension and added the line # Netscape HTTP Cookie File before the first line (youtube-dl is picky about the header).
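The header fix can also be scripted. This is a minimal sketch in Python 3. The function name `ensure_netscape_header` is hypothetical, and it only checks for the exact header text quoted above.

```python
NETSCAPE_HEADER = '# Netscape HTTP Cookie File'


def ensure_netscape_header(path):
    """Prepend the Netscape header line if it is missing.

    Returns True if the file was modified, False if it already had the header.
    """
    with open(path, 'r', encoding='utf-8') as f:
        content = f.read()
    first_line = content.split('\n', 1)[0].strip()
    if first_line == NETSCAPE_HEADER:
        return False
    with open(path, 'w', encoding='utf-8') as f:
        f.write(NETSCAPE_HEADER + '\n' + content)
    return True
```

Running it twice on the same file is harmless, since the second call finds the header and leaves the file alone.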

seville24 commented 5 years ago

Could you explain where in the .py script to add the cookie file so that youtube-dl reads it? I'm having trouble understanding where I should add the code block you posted. Thanks in advance! I'm on Windows, by the way.

cebtenzzre commented 5 years ago

@seville24 That's a diff, which means you can apply it automatically if you have access to GNU patch (on Windows you can get it as part of GnuWin32, Ctrl+F for "patch"). It's a command line utility, so you'd have to Google how to use it. You can also apply it manually: The --- and +++ tell you which files to patch, 722 is a line number for the first displayed line, - means remove the line, and + means add the line. Since you're on Windows you can't use the exact cookiefile path that I used, so replace it with the path to wherever you put your cookie file. In other words, replace the text from /home to /cookies.txt with something else beginning with a drive letter and ending in cookies.txt.
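To make the line numbers in the `@@` line concrete, here is a small Python 3 sketch that reads a unified-diff hunk header. The name `parse_hunk_header` is made up for this example. It is only meant to show what the numbers mean and is not a replacement for patch.

```python
import re

# "@@ -old_start,old_len +new_start,new_len @@"; a missing length means 1.
HUNK_RE = re.compile(r'^@@ -(\d+)(?:,(\d+))? \+(\d+)(?:,(\d+))? @@')


def parse_hunk_header(line):
    """Return (old_start, old_len, new_start, new_len) for a hunk header line."""
    m = HUNK_RE.match(line)
    if m is None:
        raise ValueError('not a hunk header: %r' % line)
    old_start, old_len, new_start, new_len = m.groups()
    return (int(old_start), int(old_len or 1),
            int(new_start), int(new_len or 1))


print(parse_hunk_header('@@ -722,7 +722,8 @@ class TumblrPost:'))
# prints (722, 7, 722, 8)
```

So the hunk covers 7 lines starting at line 722 in the old file and 8 lines starting at line 722 in the new one. The extra line is the added 'cookiefile' entry.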

AwfulBear commented 5 years ago

@seville24 I'm on Windows and couldn't get the patch program to work.

I just edited tumblr_backup.py and added the change by hand.

Search for: 'ignoreerrors': True

Add a comma at the end of that line, then add the following on a new line below it (the r prefix keeps the backslashes literal):

'cookiefile': r'C:\whatever directory\cookies.txt'

It should look like this:

        'ignoreerrors': True,
        'cookiefile': r'C:\whatever directory\cookies.txt'
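A note on Windows paths in Python strings: a backslash followed by certain letters is treated as an escape sequence, so a path can silently change depending on the folder names. The directory below is hypothetical, picked because it contains `\t` and `\n`.

```python
# Hypothetical directory, chosen because it contains \t and \n.
broken = 'C:\tumblr\new'      # \t is read as a tab, \n as a newline
raw = r'C:\tumblr\new'        # raw string: backslashes are kept as typed
escaped = 'C:\\tumblr\\new'   # doubled backslashes give the same result
forward = 'C:/tumblr/new'     # forward slashes also work on Windows

assert raw == escaped
assert broken != raw
```

Any of the last three forms is safe for the 'cookiefile' value.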

seville24 commented 5 years ago

@AwfulBear This made it work, correct?

AwfulBear commented 5 years ago

@seville24 It worked and downloaded all my files. Quite a few had 403 errors, but that is because Tumblr has already deleted them.

Total file size came out to 45 GB for 28k posts.

I’ll work on my liked section next, using the mentioned hack.

saiforigis commented 5 years ago

I'm still getting the "This Tumblr may contain sensitive media" warning. The only thing I can figure is that the cookie file is wrong. Is there a particular line that is supposed to be there? Looking over the cookie file, it has only a few lines, and none of them mention safe mode.
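A quick way to check whether a cookie file parses at all, and which sites it has cookies for, is to load it with the standard library and count the entries per domain. This is a Python 3 sketch. The function name `cookie_domains` is hypothetical, and it does not say which specific cookie Tumblr requires.

```python
import http.cookiejar
from collections import Counter


def cookie_domains(cookie_path):
    """Return a Counter mapping each cookie domain in the file to its cookie count.

    Raises http.cookiejar.LoadError if the file is not in Netscape format,
    for example when the header line is missing.
    """
    jar = http.cookiejar.MozillaCookieJar()
    jar.load(cookie_path, ignore_discard=True, ignore_expires=True)
    return Counter(cookie.domain for cookie in jar)
```

If the result has no tumblr.com entry, the export was probably taken from the wrong site or before logging in.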

cebtenzzre commented 5 years ago

@saiforigis Did you export the cookies from the Tumblr website after logging in to Tumblr? Another thing you could try is PR #189, which adds a --cookies option since it was needed for --save-notes.

saiforigis commented 5 years ago

Yes. I even logged out and back in again, and safe mode is off. The posts all seem to be videos embedded from Instagram, so I wonder if that is the issue. If you want to test, the blog is bimbocandy (NSFW, obviously).

cebtenzzre commented 5 years ago

@saiforigis Downloading videos from that blog seems to work fine for me. How did you install the youtube_dl module? Do you know what version you have? Mine is 2018.12.9 (found using python2 -m pip show youtube_dl)

saiforigis commented 5 years ago

I pointed cmd at the directory c:\Python27\scripts and then ran "pip install youtube-dl". My version shows as 2018.12.9.

I'm pretty sure I did everything right but maybe I made a simple mistake somewhere. https://imgur.com/a/0h83RAv

Thanks for the help.