vladkens / twscrape

2024! X / Twitter API scraper with authorization support. Allows you to scrape search results, user profiles (followers/following), tweets (favoriters/retweeters) and more.
MIT License
951 stars 121 forks

Ability to manual toggle between accounts in .db #138

Open Rouge-Trader opened 6 months ago

Rouge-Trader commented 6 months ago

I could be missing something, but I don't believe there is a way to choose which account sends a given request. For my use case, I want one of my accounts to read my timeline while a different account checks hashtags. This seems like a useful feature to add, and I'd be happy to contribute if I could get some pointers. Thanks

davinkevin commented 6 months ago

+1, ideally with round-robin as well, to prevent rate limiting

BonifacioCalindoro commented 6 months ago

This feature is crucial for debugging accounts and for tracing whether a specific limit applies to one of them. There are some limits that twscrape is not aware of yet: for example, when fetching tweet_details for a tweet_id from a shadowbanned account, some accounts get the info while others get None. I believe this happens on Twitter's servers, so we need a way to see which account was used for each request.

vladkens commented 5 months ago

Hi. There is a function get_for_queue in AccountsPool that you can override to control which account is used, but you need to write an SQL query for this.


import asyncio

from twscrape import API, AccountsPool

class MyPool(AccountsPool):
    def get_for_queue(self, queue: str):
        # for search timeline always use acc1
        if queue == "SearchTimeline":
            return self._get_and_lock(queue, "acc1")

        # for retweeters use acc2 or acc3
        if queue == "Retweeters":
            qs = "SELECT username FROM accounts WHERE username IN ('acc2', 'acc3') ORDER BY RANDOM() LIMIT 1"
            return self._get_and_lock(queue, qs)

        # for all other queries use the default method
        return super().get_for_queue(queue)

async def main():
    poll = MyPool()
    api = API(poll)

    async for tw in api.search("foo", limit=10):
        print(tw)

if __name__ == "__main__":
    asyncio.run(main())
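
To see what the subquery in the Retweeters branch selects, it can be run against a throwaway SQLite table. This is only a minimal sketch: the in-memory table below has a single username column and is a stand-in rather than twscrape's real schema, and pick_account is a hypothetical helper that is not part of the library.

```python
import sqlite3

# Stand-in for the accounts table (assumption: the real twscrape
# schema has more columns; only username matters for this query).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE accounts (username TEXT PRIMARY KEY)")
conn.executemany(
    "INSERT INTO accounts (username) VALUES (?)",
    [("acc1",), ("acc2",), ("acc3",), ("acc4",)],
)

# Same query as in the Retweeters branch of the example above.
QUERY = (
    "SELECT username FROM accounts "
    "WHERE username IN ('acc2', 'acc3') "
    "ORDER BY RANDOM() LIMIT 1"
)


def pick_account(db: sqlite3.Connection) -> str:
    """Run the subquery once and return the single username it selects."""
    row = db.execute(QUERY).fetchone()
    return row[0]


# Collect the distinct usernames returned over many runs.
picks = {pick_account(conn) for _ in range(200)}
print(sorted(picks))
```

Each run returns exactly one row, and that row is always one of the usernames listed in the IN clause, so acc1 and acc4 are never chosen for this queue.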

@davinkevin For a simple "round robin" you can use a random account order (no custom pool required):

api.pool._order_by = "RANDOM()"

The possible queue names are listed here: https://github.com/vladkens/twscrape/blob/main/twscrape/api.py#L11
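
Random ordering spreads requests across accounts, but it does not guarantee a strict rotation, since the same account can be picked twice in a row. If a fixed order matters, one option is to keep a small rotating picker and pass its result as the username, the same way "acc1" is passed to _get_and_lock in the example above. The sketch below is an assumption-laden illustration: RoundRobinPicker is a hypothetical helper, not part of twscrape, and it does not check whether the chosen account is currently rate limited.

```python
from itertools import cycle
from typing import Iterable, Iterator


class RoundRobinPicker:
    """Hand out usernames in a fixed, repeating order (hypothetical helper)."""

    def __init__(self, usernames: Iterable[str]) -> None:
        names = list(usernames)
        if not names:
            raise ValueError("at least one username is required")
        # cycle() repeats the list endlessly in its original order.
        self._names: Iterator[str] = cycle(names)

    def next(self) -> str:
        """Return the next username in the rotation."""
        return next(self._names)


picker = RoundRobinPicker(["acc1", "acc2", "acc3"])
order = [picker.next() for _ in range(7)]
print(order)
```

Inside a custom pool, picker.next() could replace the hard-coded "acc1" for a given queue. The trade-off is that a strict rotation may select an account that is still locked for that queue, which the random ordering over the default query avoids.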

davinkevin commented 5 months ago

Thank you for the solution, it works like a charm!
