mikf / gallery-dl

Command-line program to download image galleries and collections from several image hosting sites
GNU General Public License v2.0

[twitter] only download first page #2514

Status: Closed (afterdelight closed this 1 year ago)

afterdelight commented 2 years ago

Twitter with an adult account only downloads the first 25 photos. Please fix.

Hrxn commented 2 years ago

I just downloaded thousands of pics, like, yesterday?

Fair to assume you're doing something wrong. Steps to reproduce? gallery-dl verbose log? Link that's causing problems?

AlttiRi commented 2 years ago

The first 25 images, or 25 images total?

With the "recent" Twitter changes, an account is now required to download NSFW media.

You need to use the "auth_token" cookie from the browser where you are logged in (or "username" and "password") in the config file:

        "twitter": {
            "cookies": {
                "auth_token": "XXXABCDEFXXX"
            }
        }

Take a look at the similar issues.
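For context, gallery-dl's config file nests per-site options under a top-level "extractor" key, so the fragment above sits one level down. A minimal Python sketch that prints the full structure; `twitter_cookie_config` is a hypothetical helper (not part of gallery-dl), the token is the placeholder from above, and nothing is written to disk:

```python
import json

# Placeholder from the comment above; replace it with the value of the
# "auth_token" cookie from a browser where you are logged in.
AUTH_TOKEN = "XXXABCDEFXXX"


def twitter_cookie_config(auth_token):
    """Hypothetical helper: return a config dict with only the Twitter cookie."""
    return {
        "extractor": {
            "twitter": {
                "cookies": {"auth_token": auth_token},
            }
        }
    }


config = twitter_cookie_config(AUTH_TOKEN)
# Print the JSON so it can be compared with (or merged into) your own config file.
print(json.dumps(config, indent=4))
```

Merging this into an existing config by hand is safer than overwriting the file, since other options may already be set there.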

afterdelight commented 2 years ago

Okay, after changing the URL to https://twitter.com/vaha0121w4/media it downloaded 1146 photos, while the account has about 66000 tweets. Source: https://twisave.com/vaha0121w4

How do I download all media from the account, excluding retweets?

AlttiRi commented 2 years ago

/username and /username/media are limited to the last ~1000 posts, as far as I know.

Download the result of "the search page" https://twitter.com/search-advanced.

For example: gallery-dl "https://twitter.com/search?q=from:username".


Everything: https://twitter.com/search?q=from:vaha0121w4

~From 2014-03-03 to 2018-08-15:~ ~https://twitter.com/search?q=from:vaha0121w4+since:2014-03-03+until:2018-08-15~ It does not seem to work in gallery-dl. I get [twitter][info] No results for ....


Also, it looks like &src=typed_query is not needed for gallery-dl, while Twitter itself shows only part of the results without it. Just to compare: https://twitter.com/search?q=from:vaha0121w4&src=typed_query ~https://twitter.com/search?q=from:vaha0121w4+since:2014-03-03+until:2018-08-15&f=image&src=typed_query~
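The search URLs above can also be assembled programmatically. A minimal Python sketch, assuming only the standard library; `search_url` is a hypothetical helper name, and it only builds the URL string. Whether Twitter actually returns results for a given date range is a separate matter.

```python
from urllib.parse import urlencode


def search_url(username, since=None, until=None):
    """Hypothetical helper: build a Twitter search URL for one account.

    `since` and `until` are optional YYYY-MM-DD strings.
    """
    terms = [f"from:{username}"]
    if since:
        terms.append(f"since:{since}")
    if until:
        terms.append(f"until:{until}")
    # All operators go into the single "q" parameter, separated by spaces.
    # urlencode() turns the spaces into "+"; safe=":" keeps the colons readable.
    return "https://twitter.com/search?" + urlencode({"q": " ".join(terms)}, safe=":")


everything = search_url("vaha0121w4")
dated = search_url("vaha0121w4", since="2014-03-03", until="2018-08-15")
print(everything)
print(dated)
```

The resulting string can then be passed to gallery-dl as a quoted argument.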

afterdelight commented 2 years ago

Okay, now it's downloading more pics. We will see how many it gets.

afterdelight commented 2 years ago

Will 'https://twitter.com/search?q=from:vaha0121w4' be sufficient to download all images and videos?

AlttiRi commented 2 years ago

If you use account credentials, then it should.

You can additionally run /username/media after /search?q=from:username, just in case.

afterdelight commented 2 years ago

Okay, does https://twitter.com/search?q=from:vaha0121w4 also download retweets?

Hrxn commented 2 years ago

Yes, if you set the "retweets" option accordingly.

AlttiRi commented 2 years ago

Just checked:

/search?q=from:username does not include retweets, even if "retweets" is set (true/"original"). The option only takes effect when you download the /username endpoint.

Also, with "retweets": "original" the filename uses the original tweet's id and author name instead of the reposter's.

(With the value false (the default), retweets are not downloaded at all from the /username endpoint.)

AlttiRi commented 2 years ago

The interesting thing is that /username (timeline) misses some posts even within the first ~1000 posts.

For example, I ran gallery-dl https://twitter.com/DPMaker_ (mostly NSFW 3D)

Then I ran gallery-dl https://twitter.com/DPMaker_/media:

image

It (the first run) has gaps even for the recent posts.


For example, /media shows: https://twitter.com/DPMaker_/status/1512576718120255492

However, in the timeline I only see the artist's two replies to this post, but not the main post with the image.

UPD: Note: this post is pinned.

UPD2: This is a media reply: https://twitter.com/DPMaker_/status/1488751448624013312 (it is also missing from the timeline)


So, I think, if you are not interested in retweets, don't use /username at all. Use only /username/media, plus /search?q=from:username for accounts with a large number of tweets.

UPD3: The timeline does not include media from replies, a pinned post (at least with "retweets": false), posts outside the ~1000 most recent (while they can be in /media), ...something more?

UPD4: Also, I have not tested "text-tweets": true or postprocessors with /media.

UPD5: /media does not include "cards" ("cards": true).

afterdelight commented 2 years ago

That's nice to know. Btw, do you have a user script to display the total number of media posts on a Twitter user page and, if possible, the total number of tweets too?

afterdelight commented 2 years ago

Also, I want to know how to separate downloaded retweets into a subfolder when using '/search?q=from:username' and '"retweets": "original"'.

Example: E://main-download-folder/retweets/downloaded-pics

nisehime commented 2 years ago

The timeline does not include media from replies, a pinned post (at least with "retweets": false), posts outside the ~1000 most recent (while they can be in /media), ...something more?

  1. Have you set the pinned option to true?
  2. To download the timeline with replies, you should use the appropriate link. For your example: https://twitter.com/DPMaker_/with_replies. However, with replies you will also get potentially unneeded media that is neither the target user's tweets nor his retweets (i.e. the tweets he replied to). I explained that here. Unfortunately, my suggested fix wasn't implemented.

/search?q=from:username does not include retweets.

You should put include:nativeretweets in the search query. You can also add filter:media or filter:links to speed up the process at the cost of accuracy.
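To illustrate how these operators combine, here is a small Python sketch. `search_query_url` and its parameters are hypothetical names chosen for this example; only the operator strings (include:nativeretweets, filter:media, filter:links) come from the discussion above.

```python
from urllib.parse import quote_plus


def search_query_url(username, include_retweets=False, only=None):
    """Hypothetical helper: compose search operators into one "q" value.

    `only` may be "media" or "links" to add the matching filter: operator.
    """
    operators = [f"from:{username}"]
    if include_retweets:
        operators.append("include:nativeretweets")
    if only in ("media", "links"):
        operators.append(f"filter:{only}")
    # The operators are space-separated inside "q"; quote_plus() encodes
    # each space as "+", and safe=":" leaves the colons untouched.
    return "https://twitter.com/search?q=" + quote_plus(" ".join(operators), safe=":")


with_retweets = search_query_url("username", include_retweets=True)
media_only = search_query_url("username", include_retweets=True, only="media")
print(with_retweets)
print(media_only)
```

Since the filters trade accuracy for speed, a run without `only` remains the more complete (and slower) option.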

It does not seem to work in gallery-dl. I get [twitter][info] No results for ....

Works for me. Twitter's search is not consistent: it doesn't return all results, it is very random, and it sometimes decides not to return anything older than X. Nothing can really be done about that other than trying again over time. The Hitomi Downloader dev suggested that trying different user-agents may help.

Will 'https://twitter.com/search?q=from:vaha0121w4' be sufficient to download all images and videos?

No. You should always do timeline/media and then search.

afterdelight commented 2 years ago

I only want the user's media, both in his timeline and in his replies, not other users'. So to use the media filter, do I put it like 'https://twitter.com/search?q=from:vaha0121w4&filter:media'?

Also, can you answer both of my questions above your reply? Thanks.

nisehime commented 2 years ago

I only want the user's media, both in his timeline and in his replies.

I suggest using /media first, then continuing with search. You can copy the id of the last downloaded tweet and put it in the search query with the max_id:{id} operator. For example, the last id from the /media page of your account is 1335006214657196032, so the search link would be:

https://twitter.com/search?q=from:vaha0121w4 max_id:1335006214657196032 filter:media

However, as I type this, it doesn't work (it works in Hitomi, btw). Not because I'm doing something wrong; Twitter's search, as I explained, is just inconsistent. Removing filter:media or using filter:links (basically the same as media, but it also includes tweets with links and quoted tweets) did work in this specific case, though.

Anyway, Twitter's search is a pain. It never gives you all tweets and requires various approaches to work with. Use the issue search on this repo; it has been discussed many times.
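The "continue from the last downloaded tweet" step can be sketched in Python as follows. `continue_search_url` is a hypothetical helper; it only formats the URL with the max_id operator described above and makes no promise that Twitter's search will return anything for it.

```python
from urllib.parse import quote


def continue_search_url(username, last_id, extra=()):
    """Hypothetical helper: search URL that resumes at tweet id `last_id`.

    `extra` holds optional operators such as "filter:links".
    """
    operators = [f"from:{username}", f"max_id:{last_id}", *extra]
    # quote() encodes each space as %20, so the URL needs no further escaping
    # for the space characters; safe=":" keeps the colons readable.
    return "https://twitter.com/search?q=" + quote(" ".join(operators), safe=":")


# Id taken from the example above (last tweet on the /media page).
resume_url = continue_search_url("vaha0121w4", 1335006214657196032, extra=["filter:links"])
print(resume_url)
```

If a filtered query returns nothing, dropping `extra` and retrying later matches the advice given in the comment.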

Do you have a user script to display the total number of media posts on a Twitter user page and, if possible, the total number of tweets too?

Display where? In a browser? It's already displayed there, no scripts needed. gallery-dl also has that info in the metadata (see -K).

AlttiRi commented 2 years ago

do I put it like 'https://twitter.com/search?q=from:vaha0121w4&filter:media'?

Not /search?filter:media&q=from:username, but /search?q=filter:media+from:username. It's a single query param "q" with two "options" separated by a space. The "+" sign (or "%20") is an alias for the " " (space) character in a URL. Using a literal space would require quoting the URL on the command line (the same applies when the URL contains an "&" character).
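Both points can be checked with Python's standard library. This is a sketch for illustration only: `shlex.quote` follows POSIX shell rules, so on Windows cmd the URL would be wrapped in double quotes instead, and the `&f=live` suffix is just an example of a URL containing "&".

```python
import shlex
from urllib.parse import unquote_plus

# "+" and "%20" decode to the same space character inside the query.
plus_form = "https://twitter.com/search?q=filter:media+from:username"
percent_form = "https://twitter.com/search?q=filter:media%20from:username"
decoded = unquote_plus(plus_form)
same = decoded == unquote_plus(percent_form)
print(decoded, same)

# A URL with a literal space or an "&" must be quoted, otherwise the shell
# splits it at the space or treats "&" as a control operator.
raw_url = "https://twitter.com/search?q=filter:media from:username&f=live"
command = "gallery-dl " + shlex.quote(raw_url)
print(command)

# Splitting the command again the way a POSIX shell would gives back
# exactly two arguments: the program name and the intact URL.
argv = shlex.split(command)
```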

afterdelight commented 2 years ago

Okay, thanks for the clarification. I got [twitter][info] No results for https://twitter.com/search?q=filter:media+from:vaha0121w4 but I had success with https://twitter.com/search?filter:media&q=from:vaha0121w4

afterdelight commented 2 years ago

Also, I don't see any info about the number of tweets or media on the user's profile page, though.

AlttiRi commented 2 years ago

Okay, thanks for the clarification. I got [twitter][info] No results for https://twitter.com/search?q=filter:media+from:vaha0121w4 but I had success with https://twitter.com/search?filter:media&q=from:vaha0121w4

Because it's the same thing as https://twitter.com/search?q=from:vaha0121w4. Misusing a search param (?q=from:vaha0121w4&filter:media=) or using a fictional param (filter:media) has no effect.
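To see why the two URLs differ, it helps to look at how a standard query-string parser splits them. A short Python sketch; `search_params` is a hypothetical helper, and the claim that Twitter ignores the unknown parameter is taken from the comment above, not verified here.

```python
from urllib.parse import parse_qs, urlsplit


def search_params(url):
    """Hypothetical helper: return the query parameters of `url` as a dict."""
    # keep_blank_values=True keeps parameters that have no value at all.
    return parse_qs(urlsplit(url).query, keep_blank_values=True)


wrong = search_params("https://twitter.com/search?filter:media&q=from:vaha0121w4")
right = search_params("https://twitter.com/search?q=filter:media+from:vaha0121w4")

# In the first URL "filter:media" is a separate, empty parameter and
# "q" contains only the from: operator.
print(wrong)
# In the second URL both operators are part of the single "q" value.
print(right)
```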

Currently /search?q=filter:media+from:username does not work. 8 hours ago it worked fine.

AlttiRi commented 2 years ago

Screenshot

gallery-dl also has that info in the metadata (see -K).

image

It's the count of posts with media.

afterdelight commented 2 years ago

Thanks, now I can see the number. Now I need to know how to download the profile picture and profile banner into subfolders.

afterdelight commented 2 years ago

Hey, how do I get media included in replies and retweets? What's the command? Ty.

afterdelight commented 2 years ago

I got it: https://twitter.com/search?q=from:vaha0121w4+include:nativeretweets&f=live