backblaze-b2-samples / cloudflare-b2

Provide access to a private Backblaze B2 bucket via a Cloudflare Worker
Apache License 2.0

Cloudflare Worker for Backblaze B2

Provide access to one or more private Backblaze B2 buckets via a Cloudflare Worker, so that objects in the bucket may only be publicly accessed via Cloudflare. The worker must be configured with a Backblaze application key with access to the buckets you wish to expose.

Informal testing suggests that signing the request imposes negligible performance overhead.

Download the Source Code

git clone git@github.com:backblaze-b2-samples/cloudflare-b2.git
cd cloudflare-b2

You must also install dependencies before you can deploy or publish the worker:

npm install

Worker Configuration

Copy wrangler.toml.template to wrangler.toml and configure B2_APPLICATION_KEY_ID, B2_ENDPOINT and BUCKET_NAME. You may also configure ALLOWED_HEADERS to restrict the set of headers that will be signed and included in the upstream request to Backblaze B2, and RCLONE_DOWNLOAD to use the worker with rclone's --b2-download-url option.

[vars]
B2_APPLICATION_KEY_ID = "<your b2 application key id>"
B2_ENDPOINT = "<your S3 endpoint - e.g. s3.us-west-001.backblazeb2.com >"
# Set BUCKET_NAME to:
#   "A Backblaze B2 bucket name" - direct all requests to the specified bucket
#   "$path" - use the initial segment in the incoming URL path as the bucket name
#           e.g. https://images.example.com/bucket-name/path/to/object.png
#   "$host" - use the initial subdomain in the hostname as the bucket name
#           e.g. https://bucket-name.images.example.com/path/to/object.png
BUCKET_NAME = "$path"
# Backblaze B2 buckets with public-read visibility do not allow anonymous clients
# to list the bucket’s objects. You can allow or deny this functionality in the
# Worker via ALLOW_LIST_BUCKET
ALLOW_LIST_BUCKET = "<true, if you want to allow clients to list objects, otherwise false>"
# If set, these headers will be included in the signed upstream request
# alongside the minimal set of headers required for an AWS v4 signature:
# "authorization", "x-amz-content-sha256" and "x-amz-date".
#
# Note that, if "x-amz-content-sha256" is not included in ALLOWED_HEADERS, then
# any value supplied in the incoming request is discarded and
# "x-amz-content-sha256" will be set to "UNSIGNED-PAYLOAD".
#
# If you set ALLOWED_HEADERS, it is your responsibility to ensure that the
# list of headers that you specify supports the functionality that your client
# apps use, for example, "range". The list below is a suggested starting point.
#
# Note that HTTP headers are not case-sensitive. "host" will match "host",
# "Host" and "HOST".
#ALLOWED_HEADERS = [
#    "content-type",
#    "date",
#    "host",
#    "if-match",
#    "if-modified-since",
#    "if-none-match",
#    "if-unmodified-since",
#    "range",
#    "x-amz-content-sha256",
#    "x-amz-date",
#    "x-amz-server-side-encryption-customer-algorithm",
#    "x-amz-server-side-encryption-customer-key",
#    "x-amz-server-side-encryption-customer-key-md5"
#]
# If set, the worker will strip the `file/` prefix from incoming request paths.
# See https://rclone.org/b2/#b2-download-url
RCLONE_DOWNLOAD = "<true, if you are using the Worker to proxy downloads for rclone, otherwise false>"

You must also configure B2_APPLICATION_KEY as a secret:

echo "<your b2 application key>" | wrangler secret put B2_APPLICATION_KEY

Running in Wrangler's Local Server

Wrangler's local server loads configuration from wrangler.toml, but cannot access secrets. Instead, the local server loads additional configuration from .dev.vars.

Copy .dev.vars.template to .dev.vars and configure B2_APPLICATION_KEY:

# Configuration for running the app in local dev mode
B2_APPLICATION_KEY = "<your b2 application key>"

Passing the Bucket Name

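Set BUCKET_NAME to one of the following:

A Backblaze B2 bucket name, to direct all requests to the specified bucket.
$path, to use the initial segment in the incoming URL path as the bucket name, e.g. https://images.example.com/bucket-name/path/to/object.png
$host, to use the initial subdomain in the hostname as the bucket name, e.g. https://bucket-name.images.example.com/path/to/object.png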

If you are using the default *.workers.dev subdomain, you must either specify a bucket name in the configuration, or set BUCKET_NAME to $path and pass the bucket name in the path.

Note that, if you use the $host configuration, you must configure a Route or a Custom Domain for each bucket name. You cannot simply route *.my.domain.com/* to your worker.
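The three BUCKET_NAME modes can be illustrated with a short sketch. This is not the worker's actual code: resolveBucket and the shape of its return value are hypothetical names chosen for the example, and it only shows how an incoming URL might be split into a bucket name and an object key.

```javascript
// Hypothetical helper, for illustration only: maps an incoming URL to a
// bucket name and object key according to the BUCKET_NAME setting.
function resolveBucket(requestUrl, bucketNameSetting) {
  const url = new URL(requestUrl);
  // Drop the leading "/" so that "/a/b.png" becomes "a/b.png".
  const path = url.pathname.slice(1);

  if (bucketNameSetting === "$path") {
    // The first path segment is the bucket; the remainder is the object key.
    const slash = path.indexOf("/");
    if (slash === -1) {
      return { bucket: path, key: "" };
    }
    return { bucket: path.slice(0, slash), key: path.slice(slash + 1) };
  }

  if (bucketNameSetting === "$host") {
    // The first label of the hostname is the bucket; the whole path is the key.
    return { bucket: url.hostname.split(".")[0], key: path };
  }

  // Any other value is treated as a literal bucket name.
  return { bucket: bucketNameSetting, key: path };
}
```

With BUCKET_NAME set to $path, the URL https://images.example.com/bucket-name/path/to/object.png yields the bucket bucket-name and the key path/to/object.png. With $host, https://bucket-name.images.example.com/path/to/object.png yields the same pair.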

Restricting Signed HTTP Headers in the Upstream Request

By default, all HTTP headers in the downstream request from the client are signed and included in the upstream request to Backblaze B2, except for certain headers that the worker excludes.

If you wish to further restrict the set of headers that will be signed and included, you can configure ALLOWED_HEADERS in wrangler.toml. If ALLOWED_HEADERS is set, then the listed headers will be included in the signed upstream request alongside the minimal set of headers required for an AWS v4 signature: authorization, x-amz-content-sha256 and x-amz-date.

Note that, if x-amz-content-sha256 is not included in ALLOWED_HEADERS, then any value supplied in the incoming request will be discarded and x-amz-content-sha256 will be set to UNSIGNED-PAYLOAD in the outgoing request.

If you do set ALLOWED_HEADERS, it is your responsibility to ensure that the list of headers that you specify supports the functionality that your client apps use, for example, range for HTTP range requests. The following list, a suggested starting point, contains the HTTP request headers from the AWS S3 GetObject documentation that Backblaze B2 currently supports:

ALLOWED_HEADERS = [
    "content-type",
    "date",
    "host",
    "if-match",
    "if-modified-since",
    "if-none-match",
    "if-unmodified-since",
    "range",
    "x-amz-content-sha256",
    "x-amz-date",
    "x-amz-server-side-encryption-customer-algorithm",
    "x-amz-server-side-encryption-customer-key",
    "x-amz-server-side-encryption-customer-key-md5"
]

Note that HTTP headers are not case-sensitive. host will match host, Host and HOST.
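The filtering rules above might look roughly like the following sketch. It is not taken from the worker's source: filterHeaders is a hypothetical name, and the behavior when x-amz-content-sha256 is allowed but absent from the request is an assumption made for the example. The authorization and x-amz-date headers are not handled here, since the signing step produces them.

```javascript
// Hypothetical helper, for illustration only. `incoming` is any iterable of
// [name, value] pairs, such as a Headers object or an array of pairs.
function filterHeaders(incoming, allowedHeaders) {
  // Header names are case-insensitive, so compare them in lower case.
  const allowed = new Set(allowedHeaders.map((name) => name.toLowerCase()));
  const outgoing = new Map();

  for (const [name, value] of incoming) {
    const lower = name.toLowerCase();
    if (allowed.has(lower)) {
      outgoing.set(lower, value);
    }
  }

  // If the payload hash was not allowed, any client-supplied value has been
  // discarded above, so mark the payload as unsigned.
  // Assumption: the same fallback applies when the header is allowed but the
  // client did not send it.
  if (!outgoing.has("x-amz-content-sha256")) {
    outgoing.set("x-amz-content-sha256", "UNSIGNED-PAYLOAD");
  }
  return outgoing;
}
```

For example, with ALLOWED_HEADERS set to ["host", "range"], a request carrying Host, Range, Cookie and X-Amz-Content-Sha256 headers would be forwarded with only host, range and x-amz-content-sha256, the last of these set to UNSIGNED-PAYLOAD.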

Rclone Custom Endpoint for Downloads

Rclone's B2 integration includes an option to specify a custom endpoint for downloads: --b2-download-url. Given such a custom endpoint, rather than reading files directly via the B2 Native API, Rclone reads them from that endpoint.

If you wish to use the Rclone custom endpoint feature with this Worker, you must set the RCLONE_DOWNLOAD environment variable to true in wrangler.toml or the Cloudflare Dashboard:

RCLONE_DOWNLOAD = "true"

Rclone assumes that the Cloudflare endpoint is proxying the B2 Native API, which supports "friendly" download URLs of the form https://f000.backblazeb2.com/file/bucket-name/path/to/file.txt. So, given the custom endpoint https://mysubdomain.mydomain.tld, Rclone requests a URL such as https://mysubdomain.mydomain.tld/file/bucket-name/path/to/file.txt.
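As the configuration comments note, when RCLONE_DOWNLOAD is set the worker strips the file/ prefix from incoming request paths. A minimal sketch of that step, assuming a hypothetical helper named stripRclonePrefix that receives the setting as a boolean:

```javascript
// Hypothetical helper, for illustration only: removes the leading "/file/"
// that Rclone adds when it builds B2 Native API style download URLs.
function stripRclonePrefix(pathname, rcloneDownload) {
  const prefix = "/file/";
  if (rcloneDownload && pathname.startsWith(prefix)) {
    // Keep a single leading "/" in front of the remaining path.
    return "/" + pathname.slice(prefix.length);
  }
  // Leave the path untouched when the option is off or the prefix is absent.
  return pathname;
}
```

Under these assumptions, /file/bucket-name/path/to/file.txt becomes /bucket-name/path/to/file.txt, which has the same shape as the $path example URL.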

Bucket Configuration

Since the bucket is private, the Cloudflare Worker signs each request to Backblaze B2 using the application key, and includes the signature in the request’s Authorization HTTP header. By default, Cloudflare does not cache content where the request contains the Authorization header, so you must set your bucket’s info to include a cache-control directive.

Wrangler

You can use this repository as a template for your own worker using wrangler:

wrangler generate projectname https://github.com/backblaze-b2-samples/cloudflare-b2

Serverless

To deploy using serverless, add a serverless.yml file.

Range Requests

When the worker forwards a range request for a large file (bigger than about 2 GB), Cloudflare may return the entire file, rather than the requested range. The worker includes logic adapted from this Cloudflare Community reply by julian.cox to abort and retry the request if the response to a range request does not contain the content-range header.
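The retry behavior described above can be sketched as follows. This is an illustration under stated assumptions rather than the worker's own logic: the function names, the injected fetchFn parameter and the limit of three attempts are all hypothetical, and the sketch cancels the response body instead of using whatever abort mechanism the worker uses.

```javascript
// Hypothetical helpers, for illustration only.

// True when a range was requested but the response does not describe one.
// Both arguments need a has() method; a Headers object matches names
// case-insensitively, whereas a Map (used in tests) needs lower-case keys.
function isMissingContentRange(requestHeaders, responseHeaders) {
  return requestHeaders.has("range") && !responseHeaders.has("content-range");
}

// fetchFn is injected so that the sketch can run outside a Worker;
// inside a Worker it would simply be the global fetch.
async function fetchWithRangeRetry(fetchFn, url, requestHeaders, maxAttempts = 3) {
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    const response = await fetchFn(url, { headers: requestHeaders });
    const broken = isMissingContentRange(requestHeaders, response.headers);
    if (!broken || attempt === maxAttempts) {
      // Either the response is usable or there are no attempts left.
      return response;
    }
    // Discard the unwanted full-file body before trying again.
    if (response.body) {
      await response.body.cancel();
    }
  }
  throw new Error("maxAttempts must be at least 1");
}
```

Requests without a range header pass through on the first attempt, so the retry loop only affects range requests.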

Acknowledgements

Based on https://github.com/obezuk/worker-signed-s3-template