Open computermacgyver opened 4 years ago
Any updates on this enhancement, or ideas for a workaround? I am very interested in getting this to work. If you could give me a pointer on where to start, I could potentially implement it.
Hi @JanaLasser. We haven't done this work yet.

We would first need to write a function that takes a list of user IDs or screen names and checks them against the Twitter API using the `/1.1/users/lookup.json` endpoint. This is documented here:
https://developer.twitter.com/en/docs/twitter-api/v1/accounts-and-users/follow-search-get-users/api-reference/get-users-lookup
It accepts up to 100 users per request.

After that we would download the profile images and transform the data to be ready for processing. Functions for these already exist but are single-threaded, so they may be slow. I would leave them for now, however, and focus on the first step of using the `/users/lookup.json` endpoint.
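The batching the endpoint requires could be sketched roughly as follows. This is a minimal standalone sketch, not m3inference code: `chunk` and `lookup_users` are illustrative names, and the bearer-token auth is an assumption about how credentials might be supplied.

```python
import json
import urllib.request

LOOKUP_URL = "https://api.twitter.com/1.1/users/lookup.json"


def chunk(seq, size=100):
    """Split a list into consecutive batches of at most `size` items."""
    return [seq[i:i + size] for i in range(0, len(seq), size)]


def lookup_users(user_ids, bearer_token):
    """Fetch user objects in batches of 100 via /1.1/users/lookup.json.

    Hypothetical helper: bearer-token auth is an assumption here --
    m3inference may wire up credentials differently.
    """
    profiles = []
    for batch in chunk(user_ids, 100):
        # The endpoint accepts a comma-separated user_id parameter.
        body = ("user_id=" + ",".join(str(i) for i in batch)).encode("utf-8")
        req = urllib.request.Request(
            LOOKUP_URL,
            data=body,
            headers={"Authorization": "Bearer " + bearer_token},
            method="POST",
        )
        with urllib.request.urlopen(req) as resp:
            profiles.extend(json.loads(resp.read().decode("utf-8")))
    return profiles
```

POST is used rather than GET so long ID lists don't run into URL-length limits, which the endpoint's documentation recommends for exactly this case.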
I created a pull request (https://github.com/euagendas/m3inference/pull/30) where I implemented the changes. I hope this is the right way (first time ever pull request...).
So far there is only code for user ID lists (not screen-name lists). The code does handle lists of more than 100 IDs by splitting them into batches of 100 IDs each.
It also does not explicitly respect the API rate limit and will fail with an "Invalid response from Twitter" error if the rate limit is exceeded (similar to the single-user lookup).
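One way to close that gap is a small retry wrapper around whatever function performs the request. This is a hypothetical sketch, not part of the PR: it assumes the caller can observe the HTTP status (Twitter signals rate limiting with 429), and the `sleep` argument is injectable so tests don't actually wait.

```python
import time

RATE_LIMIT_STATUS = 429  # HTTP status Twitter returns when the limit is hit


def call_with_rate_limit(fetch, max_retries=3, wait_seconds=60,
                         sleep=time.sleep):
    """Retry `fetch` when it signals a rate limit.

    `fetch` is any zero-argument callable returning (status_code, payload).
    Illustrative helper only -- m3inference would need to adapt this to
    however its request code surfaces errors.
    """
    for attempt in range(max_retries + 1):
        status, payload = fetch()
        if status != RATE_LIMIT_STATUS:
            return payload
        if attempt < max_retries:
            # Back off before retrying; Twitter's v1.1 windows are 15 min,
            # so a real implementation might read the rate-limit headers.
            sleep(wait_seconds)
    raise RuntimeError("Rate limit still exceeded after retries")
```

Reading the `x-rate-limit-reset` response header would allow sleeping exactly until the window resets instead of using a fixed wait.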
Thank you @JanaLasser for the PR. It looks nice and I left a few comments there. I do not have a set of API keys handy -- it would be fantastic if @computermacgyver could help test when these comments were resolved.
Currently the `infer_screen_name` and `infer_id` methods in M3Twitter accept one screen name/ID and call the Twitter API to get information for that single user. This is inefficient since the endpoint can return up to 100 users at a time.

New methods should be added to the M3Twitter class to handle a long list of users. These methods should break the list into chunks of 100, respect the rate limit, and gracefully handle any API errors.
(This was previously not needed, as the class scraped profiles from HTML and was designed as a demonstration rather than something to be used at scale. The recent change to use the API opens up this opportunity, which would make the library even more user-friendly.)
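A skeleton of what such list-handling methods might look like, combining the chunking and error handling described above. All names here (`infer_id_list`, `_lookup_batch`, the mixin class) are illustrative, not existing m3inference API:

```python
class M3TwitterListMixin:
    """Hypothetical sketch of list-based lookup for the M3Twitter class."""

    BATCH_SIZE = 100  # /1.1/users/lookup.json accepts up to 100 users

    def infer_id_list(self, id_list):
        """Look up a long list of user IDs in batches of 100."""
        profiles = []
        for start in range(0, len(id_list), self.BATCH_SIZE):
            batch = id_list[start:start + self.BATCH_SIZE]
            try:
                profiles.extend(self._lookup_batch(batch))
            except RuntimeError as err:
                # Skip the failed batch but keep going; a caller that
                # prefers all-or-nothing behaviour could re-raise instead.
                print("batch starting at {} failed: {}".format(start, err))
        return profiles

    def _lookup_batch(self, batch):
        """Would call /1.1/users/lookup.json with rate-limit handling."""
        raise NotImplementedError
```

An `infer_screen_name_list` counterpart would look the same, passing the `screen_name` parameter to the endpoint instead of `user_id`.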