Closed JustAnotherArchivist closed 3 years ago
There should be options for limiting recursion depth and only recursing on the top N
replies to each tweet. The code also needs to avoid unnecessary requests by checking the reply counter on each tweet discovered; tweets without replies can be skipped entirely, and tweets with one reply can be skipped if that reply is also in the stream already.
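A minimal sketch of how these options and the reply-counter check could fit together. Everything here is an assumption made for illustration: the `Tweet` record, the `fetch_replies` callback (taken to return a tweet's direct replies in ranked order, one request per call), and the parameter names `max_depth`, `top_n` and `stream` are hypothetical and not part of snscrape.

```python
from collections import Counter, deque
from dataclasses import dataclass
from typing import Callable, Iterable, Iterator, Optional


@dataclass(frozen=True)
class Tweet:
    """Hypothetical minimal record; not snscrape's data model."""
    id: int
    reply_count: int                   # reply counter shown on the tweet
    in_reply_to: Optional[int] = None  # ID of the parent tweet, if any


def walk_replies(
    root: Tweet,
    fetch_replies: Callable[[Tweet], Iterable[Tweet]],
    max_depth: Optional[int] = None,
    top_n: Optional[int] = None,
    stream: Iterable[Tweet] = (),
) -> Iterator[Tweet]:
    """Yield replies below `root` breadth-first, skipping avoidable requests."""
    seen = {root.id}
    known_children = Counter()  # parent ID -> direct replies already in the stream
    for tweet in stream:
        seen.add(tweet.id)
        if tweet.in_reply_to is not None:
            known_children[tweet.in_reply_to] += 1

    queue = deque([(root, 0)])
    while queue:
        tweet, depth = queue.popleft()
        if max_depth is not None and depth >= max_depth:
            continue  # depth limit reached: do not recurse further
        if tweet.reply_count == 0:
            continue  # no replies: no request needed
        if tweet.reply_count == 1 and known_children[tweet.id] >= 1:
            continue  # the only reply is already in the stream
        for rank, reply in enumerate(fetch_replies(tweet)):
            if reply.id not in seen:
                seen.add(reply.id)
                yield reply
            # All replies are yielded, but only the top N are recursed into.
            if top_n is None or rank < top_n:
                queue.append((reply, depth + 1))
```

With `max_depth=1` only the direct replies to the starting tweet are returned; with `top_n=1` every reply is still yielded, but only the first-ranked reply to each tweet is expanded further.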
Hi: do you know whether snscrape can retrieve the text of a tweet and not just its URL? Can it also retrieve a user's followers? Thanks
As a beginner, I tried to write a Selenium crawler for the replies. Unfortunately, it was far too inefficient. Once you have opened the relevant tweet, the tweet IDs of the replies can be found as follows:

    # find_elements_* returns a list, so iterate over the matched tweets
    tweets = driver.find_elements_by_css_selector("div[data-testid='tweet']")
    for tweet in tweets:
        # the first link containing 'status' points to the tweet itself
        link = tweet.find_element_by_css_selector("a[href*='status']")
        tweet_id = link.get_attribute('href').split("/")[-1]

Maybe that is helpful for somebody who is able to scroll down the complete page and then use an HTML parser.
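As a rough sketch of the HTML-parser idea, the following uses only Python's standard library to collect tweet IDs from a page's source. It assumes the markup still contains `a` elements whose `href` includes `/status/<id>`, as in the selector above; the names `StatusIdParser` and `extract_status_ids` are made up for this example.

```python
import re
from html.parser import HTMLParser

# Assumption: tweet links look like ".../status/<numeric id>[/...]".
STATUS_RE = re.compile(r"/status/(\d+)")


class StatusIdParser(HTMLParser):
    """Collect unique tweet IDs from <a href=".../status/<id>"> links."""

    def __init__(self):
        super().__init__()
        self.ids = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        href = dict(attrs).get("href") or ""
        match = STATUS_RE.search(href)
        # Keep first-seen order and drop duplicates (e.g. photo links).
        if match and match.group(1) not in self.ids:
            self.ids.append(match.group(1))


def extract_status_ids(page_source):
    """Return the tweet IDs found in an HTML string, in document order."""
    parser = StatusIdParser()
    parser.feed(page_source)
    return parser.ids
```

After scrolling to the end of the page, this could be called as `extract_status_ids(driver.page_source)`. Matching the ID with a regular expression rather than taking the last path segment also handles links such as `/status/<id>/photo/1`.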
Similar to #12 but in the opposite direction: start with a certain tweet, then fetch all of its replies, recursively.