mikf / gallery-dl

Command-line program to download image galleries and collections from several image hosting sites
GNU General Public License v2.0

[Twitter] Media download failed with 404, possibly caused by a new regional restriction implemented by Twitter #6298

Closed: andykamezou closed this issue 1 week ago

andykamezou commented 2 weeks ago

Since last week I can't download any media.

[gallery-dl][debug] Version 1.27.5
[gallery-dl][debug] Python 3.12.4 - Windows-10-10.0.19045-SP0
[gallery-dl][debug] requests 2.32.3 - urllib3 2.2.2
[gallery-dl][debug] Configuration Files ['twitter.conf']
[gallery-dl][debug] Starting DownloadJob for 'https://x.com/search?q=from:{redacted} since:2024-09-23 until:2024-10-03&src=typed_query&f=live'
[twitter][debug] Using TwitterSearchExtractor for 'https://x.com/search?q=from:{redacted} since:2024-09-23 until:2024-10-03&src=typed_query&f=live'
[twitter][debug] Loading cookies from './twitter.com_cookies.txt'
[urllib3.connectionpool][debug] Starting new HTTPS connection (1): x.com:443
[urllib3.connectionpool][debug] https://x.com:443 "GET /i/api/graphql/k5XapwcSikNsEsILW5FvgA/UserByScreenName?variables=%7B%22screen_name%22%3A%22{redacted}%22%2C%22withSafetyModeUserFields%22%3Atrue%7D&features=%7B%22hidden_profile_likes_enabled%22%3Atrue%2C%22hidden_profile_subscriptions_enabled%22%3Atrue%2C%22responsive_web_graphql_exclude_directive_enabled%22%3Atrue%2C%22verified_phone_label_enabled%22%3Afalse%2C%22highlights_tweets_tab_ui_enabled%22%3Atrue%2C%22responsive_web_twitter_article_notes_tab_enabled%22%3Atrue%2C%22creator_subscriptions_tweet_preview_api_enabled%22%3Atrue%2C%22responsive_web_graphql_skip_user_profile_image_extensions_enabled%22%3Afalse%2C%22responsive_web_graphql_timeline_navigation_enabled%22%3Atrue%2C%22subscriptions_verification_info_is_identity_verified_enabled%22%3Atrue%2C%22subscriptions_verification_info_verified_since_enabled%22%3Atrue%7D HTTP/11" 200 1098
[twitter][debug] Sleeping 3.00 seconds (request)
[urllib3.connectionpool][debug] https://x.com:443 "GET /i/api/graphql/fZK7JipRHWtiZsTodhsTfQ/SearchTimeline?variables=%7B%22rawQuery%22%3A%22from%3A{redacted}+since%3A2024-09-23+until%3A2024-10-03%22%2C%22count%22%3A100%2C%22querySource%22%3A%22%22%2C%22product%22%3A%22Latest%22%7D&features=%7B%22responsive_web_graphql_exclude_directive_enabled%22%3Atrue%2C%22verified_phone_label_enabled%22%3Afalse%2C%22creator_subscriptions_tweet_preview_api_enabled%22%3Atrue%2C%22responsive_web_graphql_timeline_navigation_enabled%22%3Atrue%2C%22responsive_web_graphql_skip_user_profile_image_extensions_enabled%22%3Afalse%2C%22c9s_tweet_anatomy_moderator_badge_enabled%22%3Atrue%2C%22tweetypie_unmention_optimization_enabled%22%3Atrue%2C%22responsive_web_edit_tweet_api_enabled%22%3Atrue%2C%22graphql_is_translatable_rweb_tweet_is_translatable_enabled%22%3Atrue%2C%22view_counts_everywhere_api_enabled%22%3Atrue%2C%22longform_notetweets_consumption_enabled%22%3Atrue%2C%22responsive_web_twitter_article_tweet_consumption_enabled%22%3Atrue%2C%22tweet_awards_web_tipping_enabled%22%3Afalse%2C%22freedom_of_speech_not_reach_fetch_enabled%22%3Atrue%2C%22standardized_nudges_misinfo%22%3Atrue%2C%22tweet_with_visibility_results_prefer_gql_limited_actions_policy_enabled%22%3Atrue%2C%22rweb_video_timestamps_enabled%22%3Atrue%2C%22longform_notetweets_rich_text_read_enabled%22%3Atrue%2C%22longform_notetweets_inline_media_enabled%22%3Atrue%2C%22responsive_web_media_download_video_enabled%22%3Atrue%2C%22responsive_web_enhance_cards_enabled%22%3Afalse%7D HTTP/11" 200 15076
[twitter][debug] Active postprocessor modules: [MetadataPP]
[twitter][debug] Sleeping 1.00 seconds (download)
[urllib3.connectionpool][debug] Starting new HTTPS connection (1): pbs.twimg.com:443
[urllib3.connectionpool][debug] https://pbs.twimg.com:443 "GET /media/********?format=jpg&name=orig HTTP/11" 404 345
[downloader.http][warning] '404 Not Found' for 'https://pbs.twimg.com/media/********?format=jpg&name=orig'
[download][info] Trying fallback URL #1
[urllib3.connectionpool][debug] Resetting dropped connection: pbs.twimg.com
[urllib3.connectionpool][debug] https://pbs.twimg.com:443 "GET /media/********?format=jpg&name=4096x4096 HTTP/11" 404 345
[downloader.http][warning] '404 Not Found' for 'https://pbs.twimg.com/media/********?format=jpg&name=4096x4096'
[download][info] Trying fallback URL #2
[urllib3.connectionpool][debug] Resetting dropped connection: pbs.twimg.com
[urllib3.connectionpool][debug] https://pbs.twimg.com:443 "GET /media/********?format=jpg&name=large HTTP/11" 404 345
[downloader.http][warning] '404 Not Found' for 'https://pbs.twimg.com/media/********?format=jpg&name=large'
[download][info] Trying fallback URL #3
[urllib3.connectionpool][debug] Resetting dropped connection: pbs.twimg.com
[urllib3.connectionpool][debug] https://pbs.twimg.com:443 "GET /media/********?format=jpg&name=medium HTTP/11" 404 345
[downloader.http][warning] '404 Not Found' for 'https://pbs.twimg.com/media/********?format=jpg&name=medium'
[download][info] Trying fallback URL #4
[urllib3.connectionpool][debug] Resetting dropped connection: pbs.twimg.com
[urllib3.connectionpool][debug] https://pbs.twimg.com:443 "GET /media/********?format=jpg&name=small HTTP/11" 404 345
[downloader.http][warning] '404 Not Found' for 'https://pbs.twimg.com/media/********?format=jpg&name=small'
[download][error] Failed to download ...
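For readers skimming the log above: the downloader requests the same media file under five size names in turn, and every one returns 404. The sketch below only reproduces that order from the log; it is not gallery-dl's actual implementation, and `candidate_urls` and the `EXAMPLE_ID` placeholder are hypothetical names used for illustration.

```python
from urllib.parse import urlencode

# Size names in the order they appear in the debug log above.
SIZE_NAMES = ("orig", "4096x4096", "large", "medium", "small")


def candidate_urls(media_id, fmt="jpg"):
    """Return the media URLs in the order the log shows them being tried.

    `media_id` is a placeholder; the real ID is redacted in the log.
    """
    base = f"https://pbs.twimg.com/media/{media_id}"
    return [
        f"{base}?{urlencode({'format': fmt, 'name': name})}"
        for name in SIZE_NAMES
    ]


urls = candidate_urls("EXAMPLE_ID")  # hypothetical media ID
for url in urls:
    print(url)
```

Since all five variants fail the same way, the size parameter is unlikely to be the cause; the failure looks tied to the host or the network path.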

Session authenticated using --cookies.

There is no problem viewing the media URL in a browser while logged in.

Perhaps some kind of referrer or token is needed to access the media URL, because the behaviour seems similar to opening the media link unauthenticated in an incognito window.

Originally posted by @andykamezou in https://github.com/mikf/gallery-dl/issues/6244#issuecomment-2389248659

andykamezou commented 2 weeks ago

I did some tests and discovered that the 404 error may be due to a new regional restriction by Twitter that occurs when attempting to access a media URL without logging in.

When using a VPN (Japan server), the media loads as it should, even without logging in.

Luckily, I have a mini-PC connected to the VPN 24/7. I use it as a proxy and configure the Twitter extractor to connect via this proxy.
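For anyone wanting to try the same workaround, here is a minimal sketch of what such a setup might look like. It assumes gallery-dl's per-extractor `proxy` option (`extractor.twitter.proxy`) and a JSON config file like the `twitter.conf` in the log. The proxy address is hypothetical, and `twitter_proxy_config` is just an illustrative helper, not part of gallery-dl.

```python
import json


def twitter_proxy_config(proxy_url):
    """Build a config fragment that routes the Twitter extractor via a proxy."""
    return {"extractor": {"twitter": {"proxy": proxy_url}}}


# Hypothetical address of the machine that stays connected to the VPN.
config = twitter_proxy_config("http://192.168.1.50:8080")

# Print the JSON so it can be merged into an existing config file by hand.
print(json.dumps(config, indent=4))
```

Whether media downloads also go through this proxy may depend on the downloader settings, so it is worth confirming with the debug output.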

Let's hope this restriction will not be rolled out to other regions.

mikf commented 2 weeks ago

> without logging in

Neither gallery-dl nor Twitter itself sends any authentication info (cookies, tokens, ...) when accessing media files, so your login status shouldn't matter here.

andykamezou commented 1 week ago

> without logging in
>
> Neither gallery-dl nor Twitter itself sends any authentication info (cookies, tokens, ...) when accessing media files, so your login status shouldn't matter here.

Yeah, I'm aware that's how it usually is.

I've disabled my desktop Adguard, tried different browsers, and even curl gets the 404 page. The more I test, the more I suspect something is wrong with my main PC's connection to Twitter, but I can't pinpoint the cause.
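A rough way to repeat this kind of check outside the browser is to request the media URL with no cookies at all, once directly and once through a proxy, and compare the status codes. The sketch below uses only the Python standard library; `make_opener` and `probe_status` are hypothetical helper names, and the proxy address would be whatever machine is on the VPN.

```python
import urllib.error
import urllib.request


def make_opener(proxy=None):
    """Create an opener that uses `proxy`, or no proxy at all if it is None.

    Passing an empty mapping disables proxies picked up from the environment.
    """
    proxies = {"http": proxy, "https": proxy} if proxy else {}
    return urllib.request.build_opener(urllib.request.ProxyHandler(proxies))


def probe_status(url, proxy=None, timeout=10.0):
    """Return the HTTP status code for `url`, sending no cookies or tokens.

    Connection-level failures (urllib.error.URLError) are left to propagate.
    """
    opener = make_opener(proxy)
    try:
        with opener.open(url, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as exc:
        # 4xx/5xx responses raise HTTPError; the code is what we want here.
        return exc.code


# Example usage (not run here, since it needs network access):
#   direct = probe_status(media_url)
#   proxied = probe_status(media_url, proxy="http://192.168.1.50:8080")
```

If the direct request returns 404 while the proxied one returns 200, the difference lies in the network path rather than in cookies or login state.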

And this evening, the Twitter site (and only Twitter) suddenly refused to load any images.

Perhaps it's time for a fresh Windows install. I think I'll close this for now.