kothar / go-backblaze

A golang client for Backblaze's B2 storage
MIT License
95 stars 31 forks source link

Long-running B2 objects eventually throw UNAUTHORIZED on all requests #3

Closed encryptio closed 8 years ago

encryptio commented 8 years ago

It appears that B2's auth tokens are time-limited; I'm not sure what the time is, and it's not documented (in an earlier draft of the API docs, it was mentioned that they needed to document when users needed to regenerate auth tokens.)

In my application where a B2 instance exists for days at a time, the current implementation eventually returns the UNAUTHORIZED error on every single request, making the B2 object unusable until it is recreated (though likely just calling AuthorizeAccount on it again would fix it.)

I think the solution that would be really nice is something similar to what (*github.com/ncw/swift.Connection).Call() does - before each request, make sure we're authenticated. If we don't have an auth token, we authenticate (and fail the overall operation if that fails). If we get a 401 during a non-authentication operation, we clear our (now known bad) auth token and retry (and thus reauthenticate.) That's worked well in practice for the companies I've used it at before. (Though note that the current implementation in ncw/swift is a little wonky.)

I was going to draft a pull request for this, but it requires some pretty serious refactors and a few (fully backwards-compatible) API changes (specifically that AuthorizeAccount would no longer be a required call; effectively turning the B2 object stateless, which is nice on its own sake.)

If you don't think you have the time to do this, I can try my hand at it during my free time.

kothar commented 8 years ago

That sounds like a good idea, I have some time to clean up a few projects before new year so I will have a go at this.

How do you see the error being handled when the token first expires? Should it be silently retried (including reauth) at least once, or returned to the caller to handle the retry?

Retry should be possible to add in a single place where the request is made, so I could add a switch to enable/disable that behaviour.

encryptio commented 8 years ago

I think silent retry once would be best for most calls, otherwise all calls to the library will be wrapped, which then in my mind necessitates a wrapper object that just does retries (which, at that point, should just be in the library itself.)

I think UploadHashedFile is the one exception here; since you could have consumed some of the body during your initial request, you can't reuse the body argument as-is on a retry. Assuming UploadHashedFile returns some ErrTryAgain like thing, UploadFile (and things like it) would get the obvious loop. (I think keeping UploadHashedFile around as a raw interface that doesn't do full buffering is good; it's currently the only way to send any generic io.Reader without reading it into memory.) It's a little ugly that way, but UploadHashedFile is already a low-level interface.

kothar commented 8 years ago

@encryptio I have implemented an auto-retry behaviour in PR #4 - can you take a look and see if it covers all your use cases? The main location for retry is apiRequest. All API requests go through here apart from the file upload/download methods. Some retry behaviour still needs to be added to these as they make requests directly through the post and get methods, so I'll do a bit more work on getting those in place.