Alluxio / alluxio

Alluxio, data orchestration for analytics and machine learning in the cloud
https://www.alluxio.io
Apache License 2.0

Deleting a directory failed because of rename #18667

Closed gp1314 closed 2 months ago

gp1314 commented 3 months ago

Alluxio Version: 2.9.3

Describe the bug

  1. Write the file A/B/01.
  2. Rename A/ to C/.
  3. Occasionally, A/B is still found in the UFS.
  4. Deleting A/ then throws DirectoryNotEmptyException: Failed to delete 1 paths from the under file system: A/B (UFS dir not in sync. Sync UFS, or delete with unchecked flag.)

What appears to happen:

  1. Because the write type is ASYNC_THROUGH, the asynchronous persist to the UFS only starts after the write has finished.
  2. When the client issues the rename, the asynchronous persist task may still be running on the job worker.
  3. The rename operation lists the files in the A/ directory, copies A/B to C/B, and deletes A/B.
  4. The asynchronous persist task may fail and be retried.
  5. On retry, the job worker's persist task creates the A/B directory in the UFS and copies the data. If the copy fails, the A/B directory is not cleaned up, so A/B no longer exists in the Alluxio metadata but is still present in the UFS.
  6. When the client lists, metadata sync loads directory A from the UFS into the Alluxio metadata. The client sees that directory A should not exist and tries to delete it, but A still contains directory B, which exists in the UFS and not in Alluxio, so the delete throws the exception.
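
To make the ordering above easier to follow, here is a small in-memory sketch in Python. It is not Alluxio code: `ToyCluster`, its methods, and `DirectoryNotEmptyError` are hypothetical names, and the model assumes (as described above) that the persist job works on the original path and leaves the directories it created behind when the copy fails.

```python
class DirectoryNotEmptyError(Exception):
    """Hypothetical stand-in for Alluxio's DirectoryNotEmptyException."""


def parents(path):
    """Return every ancestor directory of path: 'A/B/01' -> ['A', 'A/B']."""
    parts = path.split("/")
    return ["/".join(parts[:i]) for i in range(1, len(parts))]


def under(path, root):
    """True if path is root itself or lies below it."""
    return path == root or path.startswith(root + "/")


class ToyCluster:
    """Alluxio metadata and the UFS modeled as two sets of paths."""

    def __init__(self):
        self.alluxio = set()
        self.ufs = set()

    def write_async_through(self, file_path):
        # The file is visible in Alluxio at once; the UFS copy is deferred.
        self.alluxio.update(parents(file_path) + [file_path])
        return file_path  # handle for the pending persist job

    def rename(self, src, dst):
        def move(paths):
            return {dst + p[len(src):] if under(p, src) else p for p in paths}
        self.alluxio = move(self.alluxio)
        self.ufs = move(self.ufs)

    def run_persist_job(self, file_path, copy_fails):
        # Assumption: the job creates the parent directories first and
        # does not remove them again when the data copy fails.
        self.ufs.update(parents(file_path))
        if not copy_fails:
            self.ufs.add(file_path)

    def list_top_level(self):
        # Metadata sync of the parent: load its direct children from the UFS.
        self.alluxio |= {p for p in self.ufs if "/" not in p}
        return sorted(p for p in self.alluxio if "/" not in p)

    def delete(self, directory):
        unknown = sorted(p for p in self.ufs
                         if under(p, directory) and p not in self.alluxio)
        if unknown:
            raise DirectoryNotEmptyError(
                "Failed to delete %d paths from the under file system: %s"
                % (len(unknown), ", ".join(unknown)))
        self.alluxio = {p for p in self.alluxio if not under(p, directory)}
        self.ufs = {p for p in self.ufs if not under(p, directory)}


def reproduce():
    cluster = ToyCluster()
    job = cluster.write_async_through("A/B/01")   # write A/B/01
    cluster.rename("A", "C")                      # rename before persist ran
    cluster.run_persist_job(job, copy_fails=True) # late job recreates A/B
    listing = cluster.list_top_level()            # sync brings A back
    try:
        cluster.delete("A")
    except DirectoryNotEmptyError as err:
        return listing, str(err)
    return listing, None
```

Calling `reproduce()` walks through the same sequence: the late persist job recreates A/B in the UFS after the rename, the listing brings A back into the metadata, and the delete fails because A/B is unknown to Alluxio.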

To Reproduce

  1. Write 200 files into 200 directories, one file per directory: A/sub-[i]/file-[i].
  2. Rename A/ to C/.
  3. List the parent directory of C/.
  4. The A/ directory shows up in the listing; delete the A/ directory.
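
The steps above could be scripted roughly as follows. This is a sketch, not a verified reproduction: it only builds the `alluxio fs` command lines, and it assumes that A/ sits directly under the Alluxio root, that `/tmp/sample.bin` is some local file to upload, and that the CLI is at `bin/alluxio`. The helper name `build_repro_commands` is made up for this example. Since the failure depends on timing, running the commands may not trigger it every time.

```python
import shlex


def build_repro_commands(n=200, local_file="/tmp/sample.bin",
                         alluxio_bin="bin/alluxio"):
    """Build the CLI calls for the reproduction steps (paths are assumptions)."""
    # Client-side property passed as a generic -D option to force ASYNC_THROUGH.
    write_opt = "-Dalluxio.user.file.writetype.default=ASYNC_THROUGH"
    commands = []
    for i in range(n):
        # Step 1: one directory and one file per index.
        commands.append([alluxio_bin, "fs", write_opt, "mkdir", f"/A/sub-{i}"])
        commands.append([alluxio_bin, "fs", write_opt, "copyFromLocal",
                         local_file, f"/A/sub-{i}/file-{i}"])
    commands.append([alluxio_bin, "fs", "mv", "/A", "/C"])   # step 2
    commands.append([alluxio_bin, "fs", "ls", "/"])          # step 3
    commands.append([alluxio_bin, "fs", "rm", "-R", "/A"])   # step 4
    return commands


if __name__ == "__main__":
    # Print the commands instead of running them, so they can be reviewed first.
    for argv in build_repro_commands():
        print(shlex.join(argv))
```

Printing rather than executing keeps the script safe to run anywhere; the output can be piped to a shell on a test cluster once the paths have been adjusted.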

Expected behavior

Urgency

Are you planning to fix it

YichuanSun commented 3 months ago

Are you planning to fix it? It is an existing bug; we recommend using CACHE_THROUGH, as you did.