Open austindonnelly opened 2 weeks ago
Thanks for the feedback! We are routing this to the appropriate team for follow-up. cc @xgithubtriage.
NB: this turns into a performance problem, because once there are 22 million files in a single directory, creating a new file as part of a new transfer takes some time. For us, just deleting the files moved us from about 1.5 Gbps to 4 Gbps.
There's code which looks like it knows how to delete an entry: TryRemoveStoredTransferAsync()
But this is never called from anything other than test code!
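As a stopgap until the library removes its own checkpoints, an external cleanup pass can keep the directory small. The following is a minimal sketch, not part of Azure.Storage.DataMovement: the function name `remove_stale_checkpoints`, the age threshold, and the assumption that checkpoint files sit directly in the checkpoint directory (no subdirectories) are all hypothetical. It deletes by modification time only, so the threshold should be longer than any transfer you might still want to resume.

```python
import os
import time


def remove_stale_checkpoints(checkpoint_dir, max_age_seconds, now=None):
    """Delete regular files in checkpoint_dir older than max_age_seconds.

    Hypothetical helper: it knows nothing about the checkpoint format and
    decides purely on each file's last-modified time.
    Returns the number of files removed.
    """
    now = time.time() if now is None else now
    removed = 0
    # os.scandir streams entries, which avoids building a huge list
    # when the directory holds millions of files.
    with os.scandir(checkpoint_dir) as entries:
        for entry in entries:
            if not entry.is_file(follow_symlinks=False):
                continue
            try:
                age = now - entry.stat(follow_symlinks=False).st_mtime
                if age > max_age_seconds:
                    os.unlink(entry.path)
                    removed += 1
            except FileNotFoundError:
                # The file disappeared between listing and deletion; skip it.
                pass
    return removed
```

For example, `remove_stale_checkpoints(path_to_azstoragedml, 7 * 24 * 3600)` would remove files untouched for a week, where `path_to_azstoragedml` is wherever your application keeps its checkpoint directory.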
Library name and version
Azure.Storage.DataMovement.Blobs 12.0.0-beta.6
Describe the bug
When checkpointing is enabled for a transfer from local files to blob storage, the checkpoint files are not deleted once the transfer completes. This means they keep accumulating. I've just deleted over 22 million files from .azstoragedml (about 9 GiB) created over several weeks of sustained transfers.
Expected behavior
I expect the checkpoint files to be deleted once a transfer has completed. I can see perhaps for a failed transfer, you'd want to keep them in case the transfer should be re-attempted. But for a successful transfer, the checkpoint files should go. Otherwise they just keep accumulating.
Actual behavior
Checkpoint files stay in the .azstoragedml directory after the transfer has completed successfully.
Reproduction Steps
Run a transfer from local disk to blob storage. Start with an empty .azstoragedml directory. Once the transfer has completed, check the contents of the .azstoragedml directory. Notice there are still files there. This is the bug: there should be none.
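The check in these steps can be scripted. The following is a rough sketch rather than anything the library provides: `count_files` is a hypothetical helper, and it assumes the checkpoint files sit directly in the directory you pass in. Run it before and after the transfer; a non-zero count after a successful transfer reproduces the bug.

```python
import os


def count_files(directory):
    """Count regular files directly inside directory (non-recursive)."""
    # os.scandir streams entries, so this stays cheap on memory even
    # when the directory contains millions of files.
    with os.scandir(directory) as entries:
        return sum(1 for entry in entries if entry.is_file(follow_symlinks=False))
```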
Environment
Windows Server 2019, .NET Framework 4.7.1, Visual Studio 2022 (17.11.5)