vladkens / twscrape

2024! X / Twitter API scrapper with authorization support. Allows you to scrape search results, User's profiles (followers/following), Tweets (favoriters/retweeters) and more.
https://pypi.org/project/twscrape/
MIT License

Question: How to get authenticated user info? #168

Closed: guntutur closed this issue 4 months ago

guntutur commented 7 months ago

I am using the account pool to log in with multiple accounts and fetch tweets with user_tweets:

await api.pool.add_account("user1", "pass", "mail1@gmail.com", "")
await api.pool.add_account("user2", "pass", "mail2@ymail.com", "")
await api.pool.login_all()

async for tweet in api.user_tweets(target_id):
    ...

I just want to know which logged-in account (user1 or user2) was used to fetch a particular tweet.

andylolz commented 7 months ago

@guntutur You can if you use api.user_tweets_raw. So:

from twscrape import API
from twscrape.models import parse_tweets

api = API()

await api.pool.add_account("user1", "pass", "mail1@gmail.com", "")
await api.pool.add_account("user2", "pass", "mail2@ymail.com", "")
await api.pool.login_all()

async for rep in api.user_tweets_raw(target_id):
    print(rep.__username)  # either user1 or user2
    for tweet in parse_tweets(rep.json()):
        ...

guntutur commented 7 months ago

Thanks andy, will try. Is there any particular reason why the __username attribute is only available from the user_tweets_raw method?

andylolz commented 7 months ago

@guntutur Good question. I don’t know the answer I’m afraid.

It’s possible it’s just set for debugging purposes, and later discarded.

It’s set just here: https://github.com/vladkens/twscrape/blob/00a8e07b43c1fbbea95566cc3aae95db76cd4ae3/twscrape/queue_client.py#L209

then it’s discarded when rep.json() is called: https://github.com/vladkens/twscrape/blob/00a8e07b43c1fbbea95566cc3aae95db76cd4ae3/twscrape/api.py#L327
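A side note on reading that field from your own code: because the name starts with two underscores, Python's name mangling can get in the way if the lookup is written as `rep.__username` inside a class. The sketch below assumes the attribute is attached under the literal name "__username" (for example via setattr); `FakeResponse`, `tag_response`, `account_of` and `Collector` are hypothetical names used only for illustration, not part of twscrape.

```python
class FakeResponse:
    """Hypothetical stand-in for the response object yielded by a _raw method."""


def tag_response(rep, username):
    # Assumption: the account name is stored under the literal key "__username".
    setattr(rep, "__username", username)
    return rep


def account_of(rep):
    # A string passed to getattr is never mangled, so this works anywhere.
    return getattr(rep, "__username", None)


class Collector:
    def mangled_lookup(self, rep):
        # Inside a class, rep.__username is compiled as rep._Collector__username,
        # so the attribute set above is not found.
        try:
            return rep.__username
        except AttributeError:
            return None

    def safe_lookup(self, rep):
        # Same lookup through getattr: the literal name is used as-is.
        return getattr(rep, "__username", None)


rep = tag_response(FakeResponse(), "user1")
print(account_of(rep))
print(Collector().mangled_lookup(rep))
print(Collector().safe_lookup(rep))
```

At module level or in a plain function, `rep.__username` and `getattr(rep, "__username", None)` behave the same; the getattr form is simply the safer habit if the code may later move into a class.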

vladkens commented 4 months ago

Hi. twscrape was designed to manage accounts at an abstract level, so this information is not exposed in the library's public API.

However, the response yielded by any _raw method records which account made the request in the __username field:

import asyncio

from twscrape import API
from twscrape.models import parse_tweets

async def main():
    api = API()
    async for rep in api.search_raw("python", limit=10):
        print("Account used:", getattr(rep, "__username", None))
        for doc in parse_tweets(rep):
            print(doc.url)

if __name__ == "__main__":
    asyncio.run(main())
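
For the original question (which account fetched which tweets), the same idea can be wrapped in a small helper. This is only a sketch: `group_by_account`, `FakePage` and `fake_raw_pages` are hypothetical names, and the fake pages stand in for real responses so the snippet runs without accounts or network access.

```python
import asyncio
from collections import defaultdict


async def group_by_account(raw_pages, parse):
    """Collect parsed items per account from an async iterator of raw responses."""
    grouped = defaultdict(list)
    async for rep in raw_pages:
        # None is used as the key if a response carries no account name.
        account = getattr(rep, "__username", None)
        grouped[account].extend(parse(rep))
    return dict(grouped)


class FakePage:
    """Hypothetical stand-in for one raw response page."""

    def __init__(self, username, items):
        setattr(self, "__username", username)
        self.items = items


async def fake_raw_pages():
    # Simulates pages fetched by alternating accounts.
    yield FakePage("user1", ["tweet-a", "tweet-b"])
    yield FakePage("user2", ["tweet-c"])
    yield FakePage("user1", ["tweet-d"])


result = asyncio.run(group_by_account(fake_raw_pages(), lambda rep: rep.items))
print(result)
```

With twscrape itself, the call would presumably look like `await group_by_account(api.user_tweets_raw(target_id), parse_tweets)`, using the two names already shown in this thread.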