JustAnotherArchivist / snscrape

A social networking service scraper in Python
GNU General Public License v3.0

Support for polling for new posts on Twitter #148

Closed JustAnotherArchivist closed 3 years ago

JustAnotherArchivist commented 3 years ago

On search and profile pages, Twitter's web interface regularly polls for new posts while you're scrolled to the top and inserts them at the top of the feed. It would be nice to support this and get a stream of new search results or user tweets.

JustAnotherArchivist commented 3 years ago

I don't think I'll add explicit support for this, but it's already possible with minimal effort by creating a scraper instance and repeatedly calling get_items while keeping track of the highest tweet ID already encountered. For example, the following will print new tweets returned by the search for foo in near real-time and chronological order as JSONL (similar to snscrape --jsonl twitter-search foo):

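import time

import snscrape.modules.twitter

# Scraper for the search query 'foo'; the loop below relies on results arriving newest first.
s = snscrape.modules.twitter.TwitterSearchScraper('foo')
maxSeen = None  # Highest tweet ID encountered so far
while True:
    newTweets = []
    for t in s.get_items():
        if maxSeen is None:
            # First poll: only record the newest tweet as the baseline.
            maxSeen = t.id
            break
        if t.id <= maxSeen:
            # Reached a tweet that was already seen; stop this poll.
            break
        newTweets.append(t)
    if newTweets:
        maxSeen = max(maxSeen, newTweets[0].id)
    # Print oldest first so the output is in chronological order.
    for tweet in reversed(newTweets):
        print(tweet.json())
    time.sleep(5)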
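
The same high-water-mark logic can be pulled into a small generator so it can be tried without network access. This is only a sketch, not part of snscrape: `poll_new` and its `fetch`, `key`, `max_polls` and `sleep` parameters are hypothetical names introduced here, and it assumes, like the loop above, that each fetch yields items newest first with increasing IDs.

```python
import time


def poll_new(fetch, key=lambda item: item.id, interval=5.0,
             max_polls=None, sleep=time.sleep):
    """Yield items not seen in earlier polls, oldest first.

    fetch is a hypothetical callable returning an iterable of items,
    newest first. key extracts a sortable ID from an item.
    """
    max_seen = None  # highest ID encountered so far
    polls = 0
    while max_polls is None or polls < max_polls:
        new_items = []
        for item in fetch():
            if max_seen is None:
                # First non-empty poll: record the baseline, emit nothing.
                max_seen = key(item)
                break
            if key(item) <= max_seen:
                # Everything from here on was already seen.
                break
            new_items.append(item)
        if new_items:
            max_seen = max(max_seen, key(new_items[0]))
        # Reverse so callers receive items in chronological order.
        yield from reversed(new_items)
        polls += 1
        if max_polls is None or polls < max_polls:
            sleep(interval)


# Offline demo: integers stand in for tweets, and each list is one poll result.
snapshots = iter([[3, 2, 1], [5, 4, 3, 2, 1], [5, 4, 3], [7, 6, 5]])
seen = list(poll_new(lambda: next(snapshots), key=lambda n: n,
                     max_polls=4, sleep=lambda _: None))
print(seen)  # prints [4, 5, 6, 7]
```

With a real scraper, something like `poll_new(s.get_items)` should behave like the loop above, since the default `key` reads the `id` attribute. Leaving `max_polls` as `None` makes it poll indefinitely.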