JustAnotherArchivist / snscrape

A social networking service scraper in Python
GNU General Public License v3.0

Support for Twitter replies #51

Closed JustAnotherArchivist closed 3 years ago

JustAnotherArchivist commented 5 years ago

Similar to #12 but in the opposite direction: start with a certain tweet, then fetch all of its replies, recursively.

JustAnotherArchivist commented 5 years ago

There should be options for limiting recursion depth and only recursing on the top N replies to each tweet. The code also needs to avoid unnecessary requests by checking the reply counter on each tweet discovered; tweets without replies can be skipped entirely, and tweets with one reply can be skipped if that reply is also in the stream already.
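A minimal sketch of the traversal described above, not snscrape's actual implementation. The `Tweet` record, `walk_replies`, and the `fetch_thread` callback are all hypothetical names. `fetch_thread(tweet_id)` is assumed to return the tweets shown under that tweet, best first, and may include nested replies. The request for a tweet is skipped when its reply counter is already covered by replies seen in the stream, which generalises the zero-reply and one-reply cases.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Callable, Dict, Iterator, List, Optional, Set


@dataclass(frozen=True)
class Tweet:
    # Hypothetical record; field names are assumptions for this sketch.
    id: int
    in_reply_to: Optional[int]
    reply_count: int


def walk_replies(
    root: Tweet,
    fetch_thread: Callable[[int], List[Tweet]],
    max_depth: Optional[int] = None,
    top_n: Optional[int] = None,
) -> Iterator[Tweet]:
    """Yield each reply below `root` once, recursing depth-first."""
    seen: Set[int] = {root.id}
    # Replies already discovered, grouped by the tweet they answer.
    children: Dict[int, Set[int]] = defaultdict(set)

    def visit(tweet: Tweet, depth: int) -> Iterator[Tweet]:
        # max_depth=1 means: only fetch the replies of the root itself.
        if max_depth is not None and depth >= max_depth:
            return
        # Skip the request if the counter shows nothing new can be found.
        if tweet.reply_count <= len(children[tweet.id]):
            return
        direct: List[Tweet] = []
        for reply in fetch_thread(tweet.id):
            if reply.in_reply_to is not None:
                children[reply.in_reply_to].add(reply.id)
            if reply.in_reply_to == tweet.id:
                direct.append(reply)
            if reply.id not in seen:
                seen.add(reply.id)
                yield reply
        # Only recurse into the top N direct replies (all if top_n is None).
        for reply in direct[:top_n]:
            yield from visit(reply, depth + 1)

    yield from visit(root, 0)
```

Because the function is a generator, a caller can stop consuming early; `max_depth` and `top_n` bound the number of requests rather than the number of tweets yielded.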

dansar39 commented 3 years ago

Hi: do you know whether snscrape can retrieve the text of a tweet and not just its URL? Can it retrieve a user's followers? Thanks

p-dre commented 3 years ago

As a beginner, I tried to write a Selenium crawler for the replies. Unfortunately, it was far too inefficient. After opening the relevant tweet, the tweet IDs of the replies can be found as follows:

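# Selenium 3 API; `driver` is a WebDriver with the tweet page already loaded.
tweets = driver.find_elements_by_css_selector("div[data-testid='tweet']")
tweet_ids = []
for tweet in tweets:  # find_elements_* returns a list, so iterate over it
    link = tweet.find_element_by_css_selector("a[href*='status']")
    tweet_ids.append(link.get_attribute('href').split("/")[-1])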

Maybe that is helpful for somebody who is able to scroll down the complete page and then use an HTML parser.
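A rough sketch of the HTML-parser half of that idea, using only Python's standard library. It assumes the fully scrolled page source is already available as a string (for example from Selenium's `driver.page_source`) and that reply links contain `/status/<digits>`, as the selector above implies. `StatusLinkParser` and `extract_tweet_ids` are hypothetical names introduced for illustration.

```python
import re
from html.parser import HTMLParser
from typing import List

# Assumption: tweet permalinks look like ".../status/<numeric id>".
STATUS_RE = re.compile(r"/status/(\d+)")


class StatusLinkParser(HTMLParser):
    """Collect tweet IDs from <a href=".../status/<id>"> links."""

    def __init__(self) -> None:
        super().__init__()
        self.tweet_ids: List[str] = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        href = dict(attrs).get("href") or ""
        match = STATUS_RE.search(href)
        # Keep first-seen order and drop duplicates (e.g. photo links).
        if match and match.group(1) not in self.tweet_ids:
            self.tweet_ids.append(match.group(1))


def extract_tweet_ids(page_source: str) -> List[str]:
    parser = StatusLinkParser()
    parser.feed(page_source)
    return parser.tweet_ids
```

Matching on the numeric part with a regular expression, rather than splitting on `/` and taking the last segment, avoids picking up trailing path pieces such as `photo/1`.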