buchgr / bazel-remote

A remote cache for Bazel
https://bazel.build
Apache License 2.0

[Azure Blob Storage] Not picking up blobs after restart #627

Open fineconstant opened 1 year ago

fineconstant commented 1 year ago

We use bazel-remote with Azure Blob Storage as a backend and currently have some cache stored in a Storage Container used by bazel-remote. After restarting the bazel-remote service, it does not seem to pick up the cache files that are already stored.

{
  "CurrSize": 0,
  "UncompressedSize": 0,
  "ReservedSize": 0,
  "MaxSize": 21474836480,
  "NumFiles": 0,
  "ServerTime": 1673265169,
  "GitCommit": "b899a9d0f9d086a25e9ec51ccba8dafbc36ca8eb",
  "NumGoroutines": 621
}

This results in cache misses and unnecessary growth of the storage account's size.
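For readers scanning the status output above, here is a minimal Python sketch that decodes the two less obvious fields. It only parses the JSON quoted in this issue; the variable names are illustrative and nothing here queries a running bazel-remote.

```python
import json
from datetime import datetime, timezone

# The status JSON exactly as quoted in the issue.
status_text = """
{
  "CurrSize": 0,
  "UncompressedSize": 0,
  "ReservedSize": 0,
  "MaxSize": 21474836480,
  "NumFiles": 0,
  "ServerTime": 1673265169,
  "GitCommit": "b899a9d0f9d086a25e9ec51ccba8dafbc36ca8eb",
  "NumGoroutines": 621
}
"""

status = json.loads(status_text)

# MaxSize is in bytes; convert to GiB (2**30 bytes).
max_gib = status["MaxSize"] / 2**30

# ServerTime is a Unix timestamp in seconds.
server_time = datetime.fromtimestamp(status["ServerTime"], tz=timezone.utc)

# True when the reported store holds no files at all.
reported_empty = status["NumFiles"] == 0 and status["CurrSize"] == 0

print(f"MaxSize: {max_gib:.0f} GiB")                # MaxSize: 20 GiB
print(f"ServerTime: {server_time.isoformat()}")     # ServerTime: 2023-01-09T11:52:49+00:00
print(f"Reported store empty: {reported_empty}")    # Reported store empty: True
```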

mostynb commented 1 year ago

The intended behaviour is that when bazel-remote has a cache miss in its filesystem cache, it then checks whether the proxy backend has that blob. The statistics reported by the /status HTTP endpoint refer only to the filesystem cache, not the proxy backend.

So if you restart bazel-remote, it should start with whatever blobs are stored in its cache dir. Are you cleaning this filesystem cache between builds? (Or running bazel-remote in builds inside containers that start with an empty cache dir?)
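The lookup order described above can be pictured with a toy model. This is a sketch only, not bazel-remote's actual code: `TieredCache`, its methods, and the use of plain dicts for both tiers are hypothetical, and copying a proxy hit into the local store is an assumption made for illustration.

```python
class TieredCache:
    """Toy model: a local store in front of a proxy backend."""

    def __init__(self, proxy):
        self.local = {}     # stands in for the filesystem cache dir
        self.proxy = proxy  # stands in for the Azure Blob container

    def put(self, key, blob):
        # Store locally and forward to the proxy backend.
        self.local[key] = blob
        self.proxy[key] = blob

    def get(self, key):
        # 1. Try the local (filesystem) tier first.
        if key in self.local:
            return self.local[key]
        # 2. On a local miss, fall back to the proxy backend.
        blob = self.proxy.get(key)
        if blob is not None:
            self.local[key] = blob  # assumption: proxy hits are kept locally
        return blob

    def status(self):
        # Like /status in the thread: reports the local tier only.
        return {
            "NumFiles": len(self.local),
            "CurrSize": sum(len(b) for b in self.local.values()),
        }


backend = {}
TieredCache(backend).put("abc", b"data")

# "Restart" with an empty local store but the same proxy backend.
restarted = TieredCache(backend)
print(restarted.status())    # {'NumFiles': 0, 'CurrSize': 0}
print(restarted.get("abc"))  # b'data'
print(restarted.status())    # {'NumFiles': 1, 'CurrSize': 4}
```

In this model, zero values in the status output right after a restart say nothing about what the proxy backend holds.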

fineconstant commented 1 year ago

Thank you for clarifying that 😄 It makes sense now. We run bazel-remote on Kubernetes and we DO NOT set the --dir flag; we only configure Azure Blob.

--max_size=$(cache-max-size)
--azblob.tenant_id=$(azblob-tenant-id)
--azblob.storage_account=$(azblob-storage-name)
--azblob.container_name=bazel-cache
--azblob.auth_method=shared_key
--azblob.shared_key=$(azblob-shared-key)
--htpasswd_file=$(htpasswd-location)

Bazel itself also runs in a separate, fresh container, so its cache dir is empty. Checking the bazel-remote logs, I saw that as soon as it receives a PUT request it stores the blob in Azure Blob, so I guess everything is working correctly. I was just confused by the reported CurrSize 😄

mostynb commented 1 year ago

Note that the --dir flag is required; I think bazel-remote will refuse to run without it.

So in your case I suspect that you have a BAZEL_REMOTE_DIR environment variable set? See the warning in the "Kubernetes notes" in README.md about not naming your Kubernetes deployment "bazel-remote".
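As background to that warning, Kubernetes injects environment variables for each Service into pods, with names derived from the Service name. The sketch below shows that naming rule for a few of the generated variables. The helper `service_env_vars` is hypothetical, and the rule by itself would not produce BAZEL_REMOTE_DIR; it only illustrates how a Service called "bazel-remote" yields names sharing the BAZEL_REMOTE_ prefix.

```python
def service_env_vars(service_name):
    """Return some of the env var names Kubernetes derives from a Service name.

    The name is upper-cased and dashes become underscores.
    """
    prefix = service_name.upper().replace("-", "_")
    return [
        f"{prefix}_SERVICE_HOST",
        f"{prefix}_SERVICE_PORT",
        f"{prefix}_PORT",
    ]


for name in service_env_vars("bazel-remote"):
    print(name)
# BAZEL_REMOTE_SERVICE_HOST
# BAZEL_REMOTE_SERVICE_PORT
# BAZEL_REMOTE_PORT
```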