Backblaze / b2-sdk-java

The official Java SDK for using Backblaze's B2 Storage APIs

INTRO

b2-sdk-java is a Java SDK for B2 clients. See the B2 API docs for context.

STATUS

The SDK is being used in production at Backblaze. All new internal B2 client code is being written with it. We are slowly transitioning our existing code to the library.

We don't expect to make many incompatible changes to the API or behavior in the near future.

FEATURES

SAMPLE

HOW TO USE

FAQ

STRUCTURE

This section is mostly for developers who work on the SDK.

Layering

To simplify implementation and testing, the B2StorageClient has three main layers and a few helpers.

The top-most layer consists of the B2StorageClientImpl and the various Request and Response classes. This layer provides the main interface for developers. The B2StorageClientImpl is responsible for acquiring account authorizations, upload urls and upload authorizations, as needed. It is also responsible for retrying operations that fail for retryable reasons. The B2StorageClientImpl uses a B2Retryer to do the retrying. An implementation of the B2RetryPolicy controls the number of retries that are attempted and the amount of waiting between attempts; the B2DefaultRetryPolicy follows our recommendations and should be suitable for almost all users. A few operations are complicated enough that they are handled by a separate class; the most prominent example is the B2LargeFileUploader.

The middle layer consists of the B2StorageClientWebifier. The webifier's job is to translate logical B2 API calls (such as "list file names" or "upload a file") into the appropriate HTTP requests and to interpret the responses. The webifier isolates the B2StorageClientImpl from having to do this mapping. We stub the webifier layer to test B2StorageClientImpl.

The bottom layer is the B2WebApiClient. It provides a few simple methods, such as postJsonReturnJson(), postDataReturnJson(), and getContent(). We stub B2WebApiClient to test the webifier implementation above it. This layer isolates the rest of the SDK from the HTTPS implementation so that developers can provide their own web client if they want. The b2-sdk-httpclient jar provides an implementation that uses the Apache HttpClient.
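The layering can be sketched roughly as follows. Every interface, class, and method shape below (WebApiClient, Webifier, WebifierImpl, RecordingWebApiClient, the URL path, and the JSON body) is a simplified stand-in invented for illustration, not the SDK's real signatures. The only point is that each layer depends on the interface below it, so a stub can be swapped in for tests.

```java
import java.util.ArrayList;
import java.util.List;

// Bottom layer (hypothetical shape): sends JSON over some HTTP implementation.
interface WebApiClient {
    String postJsonReturnJson(String url, String requestJson);
}

// Middle layer (hypothetical shape): turns a logical call into an HTTP request.
interface Webifier {
    String listFileNames(String apiUrl, String bucketId);
}

class WebifierImpl implements Webifier {
    private final WebApiClient webApiClient;

    WebifierImpl(WebApiClient webApiClient) {
        this.webApiClient = webApiClient;
    }

    @Override
    public String listFileNames(String apiUrl, String bucketId) {
        // The URL path and request body here are illustrative only.
        String url = apiUrl + "/b2_list_file_names";
        String body = "{\"bucketId\":\"" + bucketId + "\"}";
        return webApiClient.postJsonReturnJson(url, body);
    }
}

// Stub for the bottom layer: records each request and returns a canned response.
class RecordingWebApiClient implements WebApiClient {
    final List<String> requests = new ArrayList<>();

    @Override
    public String postJsonReturnJson(String url, String requestJson) {
        requests.add(url + " " + requestJson);
        return "{\"files\":[]}";
    }
}
```

A test of the middle layer would construct `new WebifierImpl(new RecordingWebApiClient())`, make a logical call, and then inspect the recorded requests without touching the network.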

One of the main helpers is our B2Json class. It uses annotations on class members and constructors to convert between Java classes and JSON.
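To give a feel for annotation-driven conversion in general, here is a minimal reflection-based sketch. It is not B2Json: the `@JsonField` annotation, the `MiniJson` class, and the `FileSummary` example type are all hypothetical, it only serializes (no parsing), and it skips string escaping for brevity.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Field;
import java.util.Arrays;
import java.util.Comparator;
import java.util.StringJoiner;

// Hypothetical marker annotation; NOT one of B2Json's real annotations.
@Retention(RetentionPolicy.RUNTIME)
@Target(ElementType.FIELD)
@interface JsonField {}

// Example value class with two annotated members and one unannotated member.
class FileSummary {
    @JsonField final String fileName;
    @JsonField final long size;
    final String localNote; // not annotated, so it is left out of the JSON

    FileSummary(String fileName, long size, String localNote) {
        this.fileName = fileName;
        this.size = size;
        this.localNote = localNote;
    }
}

class MiniJson {
    // Serializes the annotated fields in name order. Strings are quoted but not escaped.
    static String toJson(Object obj) {
        Field[] fields = obj.getClass().getDeclaredFields();
        Arrays.sort(fields, Comparator.comparing(Field::getName));
        StringJoiner json = new StringJoiner(",", "{", "}");
        for (Field field : fields) {
            if (!field.isAnnotationPresent(JsonField.class)) {
                continue;
            }
            field.setAccessible(true);
            Object value;
            try {
                value = field.get(obj);
            } catch (IllegalAccessException e) {
                throw new IllegalStateException(e);
            }
            String rendered = (value instanceof String) ? "\"" + value + "\"" : String.valueOf(value);
            json.add("\"" + field.getName() + "\":" + rendered);
        }
        return json.toString();
    }
}
```

With this sketch, `MiniJson.toJson(new FileSummary("a.txt", 12L, "scratch"))` yields `{"fileName":"a.txt","size":12}`.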

Caching

Each B2StorageClientImpl has a B2AccountAuthorizationCache and a B2UploadUrlCache.

Whenever the SDK needs an account authorization, it gets it from the B2AccountAuthorizationCache. If the cache doesn't have one, it will use its B2AccountAuthorizer to get one. When there's an authorization error, the B2Retryer clears the cache so a new authorization will be fetched the next time it's needed.
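The get-or-fetch and clear-on-error behavior described above can be sketched as below. `AuthorizationCache` and its use of a `Supplier<String>` in place of the authorizer are assumptions made for illustration; the real B2AccountAuthorizationCache works with richer authorization objects.

```java
import java.util.function.Supplier;

// Hypothetical cache: holds at most one authorization and fetches it lazily.
class AuthorizationCache {
    private final Supplier<String> authorizer; // stands in for a B2AccountAuthorizer
    private String cached; // null means nothing is cached

    AuthorizationCache(Supplier<String> authorizer) {
        this.authorizer = authorizer;
    }

    // Returns the cached authorization, asking the authorizer only when the cache is empty.
    synchronized String get() {
        if (cached == null) {
            cached = authorizer.get();
        }
        return cached;
    }

    // Called after an authorization error so the next get() fetches a fresh value.
    synchronized void clear() {
        cached = null;
    }
}
```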

Whenever the SDK needs an upload url, it gets one from its B2UploadUrlCache, which can hold multiple urls (and upload auth tokens) per bucket. When there's no upload url for a given bucket, the cache requests one from the B2 service. When the SDK uses one of them, it is removed from the cache. If the upload succeeds, the url is put back in the cache for later use. If the upload fails, the url is not put back in the cache, so it will not be reused. The B2LargeFileUploader, which encapsulates large file uploads, uses an instance of B2UploadPartUrlCache in a similar fashion.
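A minimal sketch of this take-then-return-on-success pattern follows. The `UploadUrlCache` class, its `take` and `putBack` methods, and the `Function<String, String>` fetcher are hypothetical names chosen for the example; upload urls are modeled as plain strings and auth tokens are omitted.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

// Hypothetical per-bucket pool of upload urls.
class UploadUrlCache {
    private final Map<String, Deque<String>> urlsByBucket = new HashMap<>();
    private final Function<String, String> fetcher; // asks the service for a new url

    UploadUrlCache(Function<String, String> fetcher) {
        this.fetcher = fetcher;
    }

    // Removes and returns a cached url for the bucket, or fetches a new one if none is cached.
    String take(String bucketId) {
        synchronized (this) {
            Deque<String> urls = urlsByBucket.get(bucketId);
            if (urls != null && !urls.isEmpty()) {
                return urls.pop();
            }
        }
        // Fetch outside the lock so a slow request does not block other callers.
        return fetcher.apply(bucketId);
    }

    // Call only after a successful upload; failed urls are simply never returned.
    synchronized void putBack(String bucketId, String url) {
        urlsByBucket.computeIfAbsent(bucketId, k -> new ArrayDeque<>()).push(url);
    }
}
```

Because a url is removed when taken, two concurrent uploads to the same bucket never share one, and a url that failed drops out of circulation without any explicit invalidation step.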

Retrying

B2 clients may need to retry operations while using the B2 API. The most common case is when an upload is interrupted and the client needs to fetch a new upload URL and try again. Long-running B2 clients may also have their B2 authTokens expire; when that happens, the client needs to reauthenticate. Additionally, network issues between the client and the B2 service can cause errors that should be retried. Occasional B2 service issues can also result in retryable errors. The SDK handles most of this retrying transparently to the client.

For each high-level call, the B2StorageClient uses an instance of B2Retryer; each B2Retryer is given a B2RetryPolicy object. When the retryer catches an error, it clears cached authTokens (and upload urls are not put back in the upload url cache). The retryer uses information in the error objects to categorize errors as unretryable, retryable immediately, or retryable after a delay. The B2Retryer then notifies the B2RetryPolicy of what has happened and, for retryable errors, takes the policy's guidance about whether to retry and, for retries that need a delay, how long to sleep.
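The control flow described above might look roughly like this. All names here (`ErrorKind`, `ClassifiedException`, `RetryPolicy`, `Retryer.doRetry`) and the method signatures are assumptions for the sketch, not the SDK's actual B2Retryer or B2RetryPolicy API. As a simplification, the `onError` hook stands in for clearing cached authTokens.

```java
import java.util.concurrent.Callable;

// The three categories named in the text.
enum ErrorKind { UNRETRYABLE, RETRY_IMMEDIATELY, RETRY_AFTER_DELAY }

// Hypothetical exception that already carries its category.
class ClassifiedException extends Exception {
    final ErrorKind kind;

    ClassifiedException(ErrorKind kind, String message) {
        super(message);
        this.kind = kind;
    }
}

// Hypothetical policy: decides whether to keep going and how long to wait.
interface RetryPolicy {
    boolean shouldRetry(int attemptsSoFar, ErrorKind kind);

    long delayMillis(int attemptsSoFar);
}

class Retryer {
    // Runs the operation until it succeeds, hits an unretryable error, or the policy gives up.
    static <T> T doRetry(Callable<T> operation, RetryPolicy policy, Runnable onError) throws Exception {
        int attempts = 0;
        while (true) {
            attempts++;
            try {
                return operation.call();
            } catch (ClassifiedException e) {
                onError.run(); // e.g. clear cached auth tokens
                if (e.kind == ErrorKind.UNRETRYABLE || !policy.shouldRetry(attempts, e.kind)) {
                    throw e;
                }
                if (e.kind == ErrorKind.RETRY_AFTER_DELAY) {
                    Thread.sleep(policy.delayMillis(attempts));
                }
            }
        }
    }
}
```

Keeping the loop in the retryer and the decisions in the policy means a caller can change how persistent the client is without touching the error-handling mechanics.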

SDK users can customize their retry policy by providing a B2RetryPolicy factory. By default, a factory that returns instances of B2DefaultRetryPolicy is used. The B2DefaultRetryPolicy implements the policy described in the B2 documentation, using the retry delay specified in the B2 response or an exponential backoff if none is provided.
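The delay rule in that last sentence can be illustrated with a small sketch. The `BackoffDelay` class, its method name, and the initial and maximum delays used in the example are assumptions for illustration; they are not the values or the code used by B2DefaultRetryPolicy.

```java
// Hypothetical delay calculator: prefer the server's hint, otherwise back off exponentially.
class BackoffDelay {
    private final long maxDelayMillis;
    private long nextBackoffMillis;

    BackoffDelay(long initialDelayMillis, long maxDelayMillis) {
        this.nextBackoffMillis = initialDelayMillis;
        this.maxDelayMillis = maxDelayMillis;
    }

    // retryAfterSeconds is the delay from the service response, or null if none was provided.
    long nextDelayMillis(Integer retryAfterSeconds) {
        if (retryAfterSeconds != null) {
            return retryAfterSeconds * 1000L;
        }
        long delay = nextBackoffMillis;
        // Double the fallback delay for next time, capped at the maximum.
        nextBackoffMillis = Math.min(nextBackoffMillis * 2, maxDelayMillis);
        return delay;
    }
}
```

For example, with `new BackoffDelay(1000, 4000)`, successive calls with no server hint return 1000, 2000, 4000, 4000 milliseconds, while a call with a hint of 5 seconds returns 5000 and leaves the backoff sequence unchanged.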

TESTING

We have lots of unit tests to verify that each of the classes performs as expected. As mentioned in the "Structure" section, we stub lower layers to test the layers above them. The B2WebApiClient doesn't have unit tests yet (volunteers?). We exercise the whole client, including the B2WebApiClient, during development and during the official builds by running B2Sample against the B2 servers.

I'd also like to test more with InterruptedException.

I'd like to verify that it's possible to replace the B2WebApiClient implementation in an environment that doesn't have the Apache HttpClient we use. I want to be sure we're not inadvertently pulling in classes that won't exist in such an environment. The first step was to remove the HttpClient implementation from the core jar. (We would be interested in an implementation that uses built-in Java classes instead of HttpClient to reduce external dependencies.)

For developers who are building on the SDK, we have provided an initial implementation of B2StorageClient which simulates the service. So far, it has a minimal feature set. Let us know if you'd like to work on it. (Actually, it's not in the repo yet.)

Eventual Development TO DOs

Here are some things we could do someday, in no particular order:

Potential future features

Contributors

In addition to the team at Backblaze, the following people have contributed to the SDK: