Azure / azure-sdk-for-rust

This repository is for active development of the *unofficial* Azure SDK for Rust. This repository is *not* supported by the Azure SDK team.
MIT License

Unable to delete empty "directory" in Azure Blob Storage #1522

Open barabadzhi opened 9 months ago

barabadzhi commented 9 months ago

Description:

I have a blob container in a storage account. It emulates a storage hierarchy. There is a "folder" named files that contains several other "folders", including emptyDir.

I would expect code like the following to delete the "folder" (empty or not) and all of its "contents".

container
  .blob_client("files/emptyDir")
  .delete()
  .await?;

Instead, it removes all the "files" inside, but the "folder" is still there.

demoray commented 9 months ago

I can't speak to hierarchical namespaces and how it's used. Do you have that feature enabled on your storage account?

However, standard storage accounts do not have directories. It's a synthetic construct based on blobs with '/' in the name.
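
To illustrate what "synthetic" means here, the following is a minimal in-memory sketch, not SDK code: on a flat (non-hierarchical) namespace, a "directory" can be thought of as nothing more than a shared name prefix ending in '/'. The function name `virtual_dirs` and the sample blob names are assumptions made up for this illustration.

```rust
use std::collections::BTreeSet;

/// Returns the immediate "subdirectories" under `prefix`, derived purely
/// from blob names. `prefix` is expected to be empty or to end with '/'.
/// Hypothetical helper for illustration; it does not call any Azure API.
fn virtual_dirs<'a>(blob_names: &[&'a str], prefix: &str) -> BTreeSet<&'a str> {
    blob_names
        .iter()
        .copied()
        .filter_map(|name| {
            // Keep only blobs that live under the prefix.
            let rest = name.strip_prefix(prefix)?;
            // A '/' in the remainder means the blob sits inside a "folder";
            // blobs directly under the prefix have no '/' and are skipped.
            let end = rest.find('/')?;
            Some(&rest[..end])
        })
        .collect()
}
```

Under this model, a "folder" exists only while at least one blob name carries its prefix, so deleting the last such blob makes the folder disappear from listings. Accounts with hierarchical namespaces enabled behave differently, which is what the rest of this thread is about.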

barabadzhi commented 9 months ago

Hi @demoray, thanks for your comment. Yes, I do use hierarchical namespaces, and yes, I do understand that they're an abstraction. The issue is not about platform usage, but about a concrete misbehavior of the library.

On a default storage account with hierarchical namespaces:

  1. Create files/C/Carrot & files/C/Cat.
  2. Removing Carrot

     container
         .blob_client("files/C/Carrot")
         .delete()
         .await?;

     and Cat

     container
         .blob_client("files/C/Cat")
         .delete()
         .await?;

     works fine, files/C is reported/shown as "empty" afterwards.

  3. However removing files/C, "empty" or not, does not remove anything, while not failing with an error either. This is an issue.

     container
         .blob_client("files/C")
         .delete()
         .await?;

demoray commented 9 months ago

Please note my comment about directories being entirely synthetic was regarding non-hierarchical storage accounts, which is the default when creating storage accounts.

I investigated how the Azure Portal handles deleting directories, as I was unable to find this in the REST API spec. The portal makes a REST API call to:

DELETE https://STORAGE_ACCOUNT.dfs.core.windows.net/CONTAINER/PATH?recursive=true
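
As a rough sketch of the request target above, the helper below assembles the same URL shape from its parts. The function name `dfs_delete_url` and its parameters are hypothetical; the URL layout is taken only from the portal call observed in this comment, and authentication, API-version headers, and percent-encoding of the path are deliberately left out.

```rust
/// Builds the URL for a DELETE against the dfs endpoint, following the
/// shape observed in the portal request quoted in this thread.
/// Hypothetical helper: it does no percent-encoding and sends no request.
fn dfs_delete_url(account: &str, container: &str, path: &str, recursive: bool) -> String {
    format!(
        "https://{}.dfs.core.windows.net/{}/{}?recursive={}",
        account,
        container,
        // Drop any leading '/' so the URL does not contain a double slash.
        path.trim_start_matches('/'),
        recursive
    )
}
```

For example, calling `dfs_delete_url("STORAGE_ACCOUNT", "CONTAINER", "PATH", true)` reproduces the URL shown above.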

demoray commented 9 months ago

I was successfully able to replicate the issue you're experiencing with the hierarchical feature.

Using the azure_storage_datalake client, I was able to successfully delete the directories.

    let client = DataLakeClient::new(account, storage_credentials)
        .file_system_client(container_name)
        .get_directory_client(path)
        .delete(recursive)
        .await?;

For the future, we should probably investigate how the other SDKs handle this and mirror their implementation. However, you should be able to use the azure_storage_datalake client to delete the paths.

barabadzhi commented 9 months ago

Thank you again, Brian. I didn't check the REST API myself, but it's weird that this case is missing. I'm afraid other implementations may have similar issues, as it seems to me this may be an oversight on the API side. 🤔

I may check how the Node/Python implementations behave later.

Leaving the issue open until we know more.

barabadzhi commented 8 months ago

> I was successfully able to replicate the issue you're experiencing with the hierarchical feature.
>
> Using the azure_storage_datalake client, I was able to successfully delete the directories.
>
>     let client = DataLakeClient::new(account, storage_credentials)
>         .file_system_client(container_name)
>         .get_directory_client(path)
>         .delete(recursive)
>         .await?;
>
> For the future, we should probably investigate how the other SDKs handle this and mirror their implementation. However, you should be able to use the azure_storage_datalake client to delete the paths.

Unfortunately, not in my case. The delete operation from file_system_client consistently causes a stack overflow for me (recursive or not). Getting the properties, etc. works fine. Looks like another API change on the Azure side that broke the hack. 🤔