osome-iu / botometer-python

A Python API for Botometer by OSoMe
https://botometer.osome.iu.edu
MIT License

Code is barely able to use 20% of the full botometer-pro API #39

Closed abaheti95 closed 4 years ago

abaheti95 commented 4 years ago

I have a huge amount of data to collect, so I created 4 Botometer Pro API keys, each using consumer keys from a different Twitter user. However, when I track the per-day API calls, each key barely reaches 4k calls (roughly 12% of the Pro quota), which is less than even the Freemium tier. What could be the reason for this? How should I change the code so that I can use 100% of the Botometer Pro API quota?

My code looks like this:

import botometer

# Pro API endpoint
botometer_api_url = 'https://botometer-pro.p.mashape.com'
twitter_app_auth = {
    'consumer_key': "xxxxxxxxxxxxxxxxxxxxx",
    'consumer_secret': "xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx",
}
api_key = "xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx"
bom = botometer.Botometer(botometer_api_url=botometer_api_url,
                          mashape_key=api_key,
                          wait_on_ratelimit=True,
                          **twitter_app_auth)
for user_id in list_of_ids:
    result = bom.check_account(user_id)
    # save result
abaheti95 commented 4 years ago

I have been downloading with this code for days now, and I am only able to get ~30% of the users per day, which is equivalent to the Freemium tier of the Pro API. How can I speed up this process? Kindly help!

yang3kc commented 4 years ago

We think it's a combination of Internet IO and a slow server. There is not much we can do about the IO part. As for the server, we have been seeing higher-than-usual daily request volume lately and our server is overloaded. As a result, it takes longer for the server to respond, and the timeout rate has increased (10% now). We just got a new and much more powerful server, but it may take weeks before we can migrate the service.

We are also working on updating the machine learning model, and our plan is to deploy the new model directly to the new server. But we can't guarantee when this will be done, because there is a lot of research and testing we need to do to make sure the new model is better and reliable.

In short, we are sorry, but the situation is unlikely to improve in the next couple of weeks. Once we are done with the upgrade, though, it will be faster and generate more accurate and informative results.

phui commented 4 years ago

For check_accounts_in, perhaps we can use asyncio, concurrent.futures, or simply threading?

Examples:

asyncio : https://stackoverflow.com/a/57689101/2593536

concurrent : https://stackoverflow.com/a/46144596/2593536

threading : https://stackoverflow.com/a/2635066/2593536
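A minimal sketch of the threading route using concurrent.futures. Here `check_account` is a stand-in for `bom.check_account` (the real call does network IO, which is exactly where threads help since the GIL is released while waiting on the socket); the function name `check_accounts_threaded` and the worker count of 8 are my own assumptions, not part of this library:

```python
from concurrent.futures import ThreadPoolExecutor, as_completed

def check_account(user_id):
    # Stand-in for bom.check_account(user_id); replace with the real
    # Botometer call in actual use.
    return {"user_id": user_id, "scores": {}}

def check_accounts_threaded(ids, max_workers=8):
    """Run check_account over many IDs concurrently, collecting results by ID."""
    results = {}
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # Map each submitted future back to the user ID it belongs to
        futures = {pool.submit(check_account, uid): uid for uid in ids}
        for fut in as_completed(futures):
            uid = futures[fut]
            try:
                results[uid] = fut.result()
            except Exception as exc:
                # Record the failure so one bad account doesn't abort the run
                results[uid] = {"error": str(exc)}
    return results
```

Note that this only hides per-request latency; it does not raise the server-side daily quota, and with `wait_on_ratelimit=True` the rate limiter still caps overall throughput.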

yang3kc commented 4 years ago

We ended the free API to relieve the heavy load on our server. Things should be better now.