linkedin / ambry

Distributed object store
https://github.com/linkedin/ambry/wiki
Apache License 2.0

[vcr-2.0] Delete blob eagerly #2836

Closed snalli closed 1 month ago

snalli commented 1 month ago

This PR deletes a blob from the cloud as soon as the VCR receives a DEL message from the server via replication. In the current method, we do not delete the blob outright; we add a delete-timestamp to the Azure blob metadata, and compaction deletes the blob later based on that timestamp.

The benefits of the new approach include:

  1. Proactively reclaiming space in Azure.
  2. Reducing the number of network calls required to delete a blob; the current method takes 2 network calls to attach the timestamp, plus another 2 network calls from compaction to remove the blob.
  3. Decreasing the size of the response from the list-metadata API used by compaction to list all blobs in an Azure partition.
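
To make the difference between the two paths concrete, here is a minimal in-memory sketch. It is not Ambry's or Azure's API: `FakeCloudPartition`, its method names, and the `deleteTime` metadata key are all hypothetical stand-ins. It only illustrates that a timestamp-marked blob keeps occupying space and keeps showing up in listings until compaction runs, while an eagerly deleted blob disappears right away.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical in-memory stand-in for one Azure partition (illustration only).
class FakeCloudPartition {
  // blobId -> metadata; a "deleteTime" entry marks a blob deleted the current (lazy) way.
  private final Map<String, Map<String, String>> blobs = new HashMap<>();

  void upload(String blobId) {
    blobs.put(blobId, new HashMap<>());
  }

  // Current method: keep the blob and only record when it was deleted.
  void markDeleted(String blobId, long deleteTimeMs) {
    Map<String, String> metadata = blobs.get(blobId);
    if (metadata != null) {
      metadata.put("deleteTime", Long.toString(deleteTimeMs));
    }
  }

  // Proposed method: remove the blob as soon as the DEL message arrives.
  void deleteEagerly(String blobId) {
    blobs.remove(blobId);
  }

  // Compaction pass: drop every blob whose delete timestamp is at or before the cutoff.
  // Returns the number of blobs removed.
  int compact(long cutoffMs) {
    int before = blobs.size();
    blobs.values().removeIf(m ->
        m.containsKey("deleteTime") && Long.parseLong(m.get("deleteTime")) <= cutoffMs);
    return before - blobs.size();
  }

  // Number of entries a list-metadata style call would have to return.
  int listedBlobCount() {
    return blobs.size();
  }
}
```

In this sketch, a blob deleted via `markDeleted` is still counted by `listedBlobCount()` until `compact` runs, whereas `deleteEagerly` shrinks the listing immediately.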

Based on customer usage patterns, few customers use the undelete feature. If the VCR receives an undelete request after a blob has been removed from the cloud, the blob is re-uploaded to the cloud using the replication protocol.
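
The undelete case can be sketched roughly as follows. This is a simplified model under stated assumptions, not the code in this patch: `UndeleteSketch`, `onDelete`, and `onUndelete` are hypothetical names, and the server replica and the cloud are modeled as plain maps rather than real replication traffic.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical model of a VCR handling DEL and UNDELETE with eager deletion.
class UndeleteSketch {
  // Blobs still held by the server replica (assumed to keep the content).
  private final Map<String, byte[]> serverReplica;
  // Blobs currently present in the cloud.
  private final Map<String, byte[]> cloud = new HashMap<>();

  UndeleteSketch(Map<String, byte[]> serverReplica) {
    this.serverReplica = serverReplica;
    // Start with everything already replicated to the cloud.
    this.cloud.putAll(serverReplica);
  }

  // Eager delete: the blob leaves the cloud as soon as DEL is replicated.
  void onDelete(String blobId) {
    cloud.remove(blobId);
  }

  // Undelete: if the blob is gone from the cloud, fetch it from the server
  // replica and upload it again. Returns true only when a re-upload happened.
  boolean onUndelete(String blobId) {
    if (cloud.containsKey(blobId)) {
      return false; // still in the cloud, nothing to re-upload
    }
    byte[] content = serverReplica.get(blobId);
    if (content == null) {
      return false; // the server no longer has the blob either
    }
    cloud.put(blobId, content);
    return true;
  }

  boolean inCloud(String blobId) {
    return cloud.containsKey(blobId);
  }
}
```

The trade-off shown here is that undelete becomes more expensive (a full re-upload instead of a metadata update), which is acceptable only because undelete is rare.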

This patch also removes some unused legacy code.