Backblaze / b2-sdk-python

Python library to access B2 cloud storage.

cannot authorize account #343

Closed tasiotas closed 2 years ago

tasiotas commented 2 years ago

Hi,

I use SqliteAccountInfo("~/.b2_account_info") to store account info. When I call the B2 API I get the error shown in the log below.

I thought InMemoryAccountInfo() is not thread safe, but that the one I use is fine. I'm not even sure it's a threading issue. It takes 9 minutes to time out...

Any clues? Thank you


INFO 2022-08-09 21:56:49,150 b2http 432 140561966417664 Pausing thread for 1 seconds because that is what the default exponential backoff is
INFO 2022-08-09 21:57:04,510 b2http 432 140561974810368 Pausing thread for 1 seconds because that is what the default exponential backoff is
INFO 2022-08-09 21:58:36,671 b2http 432 140561995003648 Pausing thread for 1 seconds because that is what the default exponential backoff is
INFO 2022-08-09 21:58:41,791 b2http 432 140561986610944 Pausing thread for 1 seconds because that is what the default exponential backoff is
INFO 2022-08-09 21:59:02,270 b2http 432 140561966417664 Pausing thread for 1 seconds because that is what the default exponential backoff is
INFO 2022-08-09 21:59:17,630 b2http 432 140561974810368 Pausing thread for 1 seconds because that is what the default exponential backoff is
INFO 2022-08-09 22:00:49,790 b2http 432 140561995003648 Pausing thread for 1 seconds because that is what the default exponential backoff is
INFO 2022-08-09 22:00:54,911 b2http 432 140561986610944 Pausing thread for 1 seconds because that is what the default exponential backoff is
INFO 2022-08-09 22:01:15,391 b2http 432 140561966417664 Pausing thread for 2 seconds because that is what the default exponential backoff is
INFO 2022-08-09 22:01:30,750 b2http 432 140561974810368 Pausing thread for 2 seconds because that is what the default exponential backoff is
INFO 2022-08-09 22:03:02,911 b2http 432 140561995003648 Pausing thread for 2 seconds because that is what the default exponential backoff is
INFO 2022-08-09 22:03:08,031 b2http 432 140561986610944 Pausing thread for 2 seconds because that is what the default exponential backoff is
INFO 2022-08-09 22:03:28,510 b2http 432 140561966417664 Pausing thread for 3 seconds because that is what the default exponential backoff is
INFO 2022-08-09 22:03:43,870 b2http 432 140561974810368 Pausing thread for 3 seconds because that is what the default exponential backoff is
INFO 2022-08-09 22:05:16,030 b2http 432 140561995003648 Pausing thread for 3 seconds because that is what the default exponential backoff is
INFO 2022-08-09 22:05:21,151 b2http 432 140561986610944 Pausing thread for 3 seconds because that is what the default exponential backoff is
Connection error: HTTPSConnectionPool(host='api002.backblazeb2.com', port=443): Max retries exceeded with url: /b2api/v2/b2_list_buckets (Caused by NewConnectionError('<urllib3.connection.HTTPSConnection object at 0x7fd7217c14f0>: Failed to establish a new connection: [Errno 110] Connection timed out'))

ppolewicz commented 2 years ago

I don't know why, but the server does not accept requests you send it, thinking there are so many requests for this account that it must protect itself against any more traffic. It looks like you may be structuring the sessions in your program in a way that is not really compatible with how the protocol was designed. I could probably help you if you could describe what your program is doing.

Hitting a race condition in InMemoryAccountInfo is quite hard; it's mostly a theoretical issue. Implementing a thread-safe InMemoryAccountInfo is a fairly small amount of work, so if you'd be interested in contributing that, we could guide you.

tasiotas commented 2 years ago

Looking closely at the error, it's failing to reach /b2api/v2/b2_list_buckets. I do remember not allowing my API key to list buckets... Could that be the issue?

This issue comes and goes, so it's a bit hard to reproduce on my end.

I am totally OK with switching to InMemoryAccountInfo instead. From the docs, I got the impression that for a server (Django) I would be better off with the Sqlite solution, but the difference between them wasn't very clear.

I'm running this in a Django Model class.

info = SqliteAccountInfo("~/.b2_account_info")
b2_api = B2Api(info)
application_key_id = env("BACKBLAZE_PRIVATE_KEY_ID")
application_key = env("BACKBLAZE_PRIVATE_KEY")
b2_api.authorize_account("production", application_key_id, application_key)
bucket = b2_api.get_bucket_by_name(env("BACKBLAZE_PRIVATE_BUCKET"))
token = bucket.get_download_authorization(self.filepath, 10)
url = bucket.get_download_url(self.filepath)
return url

tasiotas commented 2 years ago

So unfortunately I'm getting the same result even if I switch to

info = InMemoryAccountInfo()

And allowing the API key to list all buckets also didn't make much difference.

If that was simply a bad request to the API, shouldn't it return an error code immediately? 10 minutes to time out is not really a timeout ;)
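As a rough check on where the time goes, the timestamps in the log above can be grouped by thread ID. The following is only a sketch: it assumes the log layout shown earlier (level, date, time, module, line number, thread ID), the function name `gaps_by_thread` is made up for illustration, and the sample lines are copied from the log for two of the four threads.

```python
import re
from collections import defaultdict
from datetime import datetime

# Sample lines copied from the log above (two of the four threads).
LOG = """\
INFO 2022-08-09 21:56:49,150 b2http 432 140561966417664 Pausing thread for 1 seconds because that is what the default exponential backoff is
INFO 2022-08-09 21:57:04,510 b2http 432 140561974810368 Pausing thread for 1 seconds because that is what the default exponential backoff is
INFO 2022-08-09 21:59:02,270 b2http 432 140561966417664 Pausing thread for 1 seconds because that is what the default exponential backoff is
INFO 2022-08-09 21:59:17,630 b2http 432 140561974810368 Pausing thread for 1 seconds because that is what the default exponential backoff is
INFO 2022-08-09 22:01:15,391 b2http 432 140561966417664 Pausing thread for 2 seconds because that is what the default exponential backoff is
INFO 2022-08-09 22:03:28,510 b2http 432 140561966417664 Pausing thread for 3 seconds because that is what the default exponential backoff is
"""

# Assumed layout: LEVEL DATE TIME MODULE LINENO THREAD_ID MESSAGE
LINE_RE = re.compile(
    r"^\w+ (\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3}) \S+ \d+ (\d+) "
)


def gaps_by_thread(log_text):
    """Return {thread_id: [seconds between consecutive log lines]}."""
    stamps = defaultdict(list)
    for line in log_text.splitlines():
        match = LINE_RE.match(line)
        if match:
            when = datetime.strptime(match.group(1), "%Y-%m-%d %H:%M:%S,%f")
            stamps[match.group(2)].append(when)
    return {
        thread_id: [(b - a).total_seconds() for a, b in zip(times, times[1:])]
        for thread_id, times in stamps.items()
    }


for thread_id, gaps in gaps_by_thread(LOG).items():
    print(thread_id, [round(g, 1) for g in gaps])
```

In these sample lines each thread logs a new backoff message roughly every 133 seconds, while the pauses themselves are only 1 to 3 seconds. That would be consistent with most of the wait being spent inside each connection attempt before it fails, as the final `[Errno 110] Connection timed out` suggests, rather than in the backoff pauses.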

ppolewicz commented 2 years ago

What I think is happening is that your account is being flooded with requests while you are executing this test.

Please create a fresh B2 account (maybe in a different region, so that one day you can use it to set up replication) and try to reproduce the issue on that new account to confirm if my guess is right. You can set it up on an address like login+something@gmail.com to get it on the same email.

tasiotas commented 2 years ago

I was wondering whether the API is too busy when I try to reach it, or whether I am being rate limited. So while it's happening on my dev server (Ubuntu via Docker), I tried to authorize from another console with a basic request:

import requests

r = requests.get(
    "https://api.backblazeb2.com/b2api/v2/b2_authorize_account",
    auth=("keyid", "key"),
)
print(f"r: {r}")

I do get a 200 while my server is hanging. This indicates that the API endpoint is fine. Could it be Django related? A Django/server reboot doesn't help, so I don't think it's a threading issue...

ppolewicz commented 2 years ago

Perhaps you flooded B2 with a ton of requests from your server and they blocked you?

Try from a different server maybe?

tasiotas commented 2 years ago

I don't think I flooded them and got limited. I have a feeling that sometimes the API is simply not available, returning 500. That's fine. I'll tell my customers to try again in a few minutes.

I'll switch to sending requests manually, so I can get a response immediately without waiting for this lib to exit after 10 minutes. It would be cool to have an option to disable retries.

Thank you

ppolewicz commented 2 years ago

Retry management is something we could improve for sure, though most users who have spoken about it in GitHub issues would like the operation to complete without errors even if it takes an extra hour.

You can still use b2-sdk-python and change only the retry logic. That should be easier than issuing requests yourself, since you would then need to take care of a lot of other things (token management, handling 429, wrapping errors, etc.).
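To illustrate what a bounded retry policy might look like in general, here is a minimal standalone sketch. It does not use or modify b2-sdk-python's own retry code; `call_with_budget`, `RetryBudgetExceeded`, and all parameters are hypothetical names chosen for this example.

```python
import time


class RetryBudgetExceeded(Exception):
    """Raised when the attempt or time budget runs out."""


def call_with_budget(func, max_attempts=3, max_seconds=30.0, base_delay=1.0,
                     retry_on=(ConnectionError, TimeoutError),
                     sleep=time.sleep, clock=time.monotonic):
    """Call func(), retrying on retry_on errors within a fixed budget."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    start = clock()
    last_error = None
    for attempt in range(1, max_attempts + 1):
        try:
            return func()
        except retry_on as exc:
            last_error = exc
        # Exponential backoff: base_delay, 2 * base_delay, 4 * base_delay, ...
        delay = base_delay * 2 ** (attempt - 1)
        # Give up when no attempts remain or the next pause would
        # push the total elapsed time past the budget.
        if attempt == max_attempts or clock() - start + delay > max_seconds:
            break
        sleep(delay)
    raise RetryBudgetExceeded(f"gave up after {attempt} attempt(s)") from last_error


# Demo with a stand-in function that fails twice, then succeeds.
outcomes = iter([ConnectionError("down"), ConnectionError("down"), "ok"])


def flaky():
    outcome = next(outcomes)
    if isinstance(outcome, Exception):
        raise outcome
    return outcome


# The pause is replaced by a no-op so the demo finishes instantly.
print(call_with_budget(flaky, sleep=lambda seconds: None))
```

Note that a wrapper like this only limits how many times, and for how long, an operation is retried between attempts. It cannot shorten a single attempt that is itself blocked waiting on a connection; that would need a timeout on the underlying call.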

I don't know what triggered the situation you've encountered, but B2 as a service is used by a ton of people, and if the API sometimes broke for a long time I think we would hear about it :) It does look like the client got some 429s and some TCP-level timeouts, which is not typically what I've seen the server do. You also asked about thread safety, which made me believe you may have been detected by some rate limiting system.

If you figure out one day what it was, let us know!