vladkens / twscrape

2024! X / Twitter API scraper with authorization support. Allows you to scrape search results, user profiles (followers/following), tweets (favoriters/retweeters), and more.
https://pypi.org/project/twscrape/
MIT License
951 stars, 121 forks

Some questions from newbies using GitHub, about twitter API, cookies, code and adding a Twitter account. #133

Closed: lengquanqu closed this issue 5 months ago

lengquanqu commented 7 months ago

Hi vladkens, this is the first time I've used a GitHub tool to crawl Twitter data for my research. I've already installed twscrape, but I'm having trouble with the "add Twitter account" section. You seem to offer two ways to add a Twitter account (via the CLI and via Python). Probably because I haven't used GitHub before, I'm confused about where to start with the "add Twitter account" section, since your repository mixes CLI instructions with Python code. My exact questions are as follows:

  1. Is the readme.md section enough to simply crawl Twitter data?

  2. Your repository seems to mention that twscrape uses the Twitter API. I've already obtained my API key through Twitter's official developer platform, but I haven't found anywhere in the Python code you've given to add it.

  3. Do I need a Twitter API key to add a Twitter account through the CLI? Your repository doesn't seem to give such a command.

  4. As for the Python code in the "Usage" section, I'm not sure what needs to be replaced with my own values, or whether I need to add anything extra in the parentheses. For example, do I need to put my Twitter API key inside the parentheses of "api = API()"? Does "tweet_id" in "await api.tweet_details(tweet_id)" need to be replaced with my Twitter ID?

  5. I am confused about the cookies mentioned in your repository: I found several cookie values in the developer tools window of my browser (screenshot below). Which one should I choose?

Please forgive me for asking so many questions, but your answers are very important to me, and I really hope you can answer my questions. Thank you so much for your hard work!

(screenshot: cookie values shown in the browser's developer tools)
BFAGIT commented 6 months ago

Hello @lengquanqu, it seems you don't quite know what GitHub is yet. To put it very simply, GitHub is an online repository for keeping track of the changes made to code; it relies on Git. You should look it up a bit more, as it will greatly help you in your development adventure. You run nothing from GitHub: you need to download the code to your local machine and execute it there. The best way to do that is with the package manager pip that exists in your Python environment.

To run the twscrape library you will need a Python environment set up, with twscrape and all its dependencies installed. Once that is done you will be able to run the examples from the README, but of course you will have to provide your own accounts.
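For example, installing it from PyPI (the package name matches the PyPI link at the top of this page):

```shell
# Install twscrape and its dependencies into your Python environment
pip install twscrape

# The CLI entry point should now be available
twscrape --help
```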

1) Yes, the README shows some very simple data crawling, for instance with the api.search() method.

2) The library uses the public Twitter API that runs in the background when you visit the Twitter website. That's the beauty of this library: no need to pay for the private/paid Twitter API to get data.

3) No, since no API keys are needed.

4) The only thing you need to provide is an account that you have created on your side: the username, password, email, and email password of that account.

5) Cookies are optional; they are another login method that is more complex when you don't know much about how a web page behaves. You should first try the standard login method.
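If you do want to try cookies later: as far as I know, the only values twscrape cares about are the session cookies (ct0 and auth_token are the usual candidates; treat that as an assumption and check the README), not the dozens of other entries you see in the developer tools. A stdlib-only sketch that filters a pasted cookie string down to just those:

```python
from http.cookies import SimpleCookie

def extract_twitter_cookies(raw: str) -> str:
    """Keep only the cookie values assumed relevant for a cookie-based login."""
    wanted = {"ct0", "auth_token"}  # assumption: the session cookies that matter
    jar = SimpleCookie()
    jar.load(raw)
    pairs = [f"{name}={morsel.value}" for name, morsel in jar.items() if name in wanted]
    return "; ".join(sorted(pairs))

# Paste the whole cookie string from your browser; irrelevant entries are dropped
raw = "guest_id=abc; ct0=1234; lang=en; auth_token=deadbeef"
print(extract_twitter_cookies(raw))  # -> auth_token=deadbeef; ct0=1234
```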

Here is a very basic script extracted from the README. You need to replace the arguments of add_account with your own details:

import asyncio

from twscrape import API, gather
from twscrape.logger import set_log_level

async def main():
    api = API()  

    # ADD ACCOUNTS 
    await api.pool.add_account("user1", "pass1", "u1@example.com", "mail_pass1")

    # LOGIN TO ACCOUNT 
    await api.pool.login_all()

    # SEARCH TWEETS ABOUT ELON MUSK 
    tweets = await gather(api.search("elon musk", limit=20))  # list[Tweet]

    # PRINT TWEETS 
    for tweet in tweets:
        print(tweet)

if __name__ == "__main__":
    asyncio.run(main())
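On your question 4: api = API() takes no API key, and tweet_id is the numeric ID at the end of a tweet's URL, not your username. A small stdlib helper to pull it out (the URL below is a made-up example):

```python
import re

def tweet_id_from_url(url: str) -> int:
    """Extract the numeric status ID from a tweet URL."""
    m = re.search(r"/status/(\d+)", url)
    if m is None:
        raise ValueError(f"no tweet id found in {url!r}")
    return int(m.group(1))

# Hypothetical URL: the trailing digits are what api.tweet_details() expects
print(tweet_id_from_url("https://x.com/SomeUser/status/1234567890123456789"))
# -> 1234567890123456789
```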

Hope it helps, KR

github-actions[bot] commented 5 months ago

This issue is stale because it has been open 30 days with no activity. Remove stale label or comment or this will be closed in 5 days.

github-actions[bot] commented 5 months ago

This issue was closed because it has been stalled for 5 days with no activity.