Alluxio / alluxio

Alluxio, data orchestration for analytics and machine learning in the cloud
https://www.alluxio.io
Apache License 2.0

Alluxio ASYNC_THROUGH to Ceph: files never persist to Ceph #14685

Open lilyzhoupeijie opened 2 years ago

lilyzhoupeijie commented 2 years ago

Alluxio Version: 2.6.0, 2.6.1, 2.8.0

Describe the bug: When an Alluxio worker crashes, the files stored on that worker are marked as missing (LOST). After some time the worker comes back online, but the files stay in the LOST state even though they show as 100% in Alluxio, and they are never persisted to Ceph, as shown below:

To Reproduce: Stop one Alluxio worker. Once its files are marked as lost, bring the worker back online.

Expected behavior: When a file is marked as lost but is 100% in Alluxio again, it should move to the TO_BE_PERSISTED state and then be persisted to Ceph.

[Image]

Are you planning to fix it: In progress.

Searching the project, I found logic in the class LostFileDetector that marks files as LOST:

    if (inode.getPersistenceState() != PersistenceState.PERSISTED) {
      mInodeTree.updateInode(journalContext, UpdateInodeEntry.newBuilder()
          .setId(inode.getId())
          .setPersistenceState(PersistenceState.LOST.name())
          .build());

However, I could not find any logic that moves files from LOST back to NOT_PERSISTED once the worker is back online. I'm not sure whether the logic that moves files from LOST to PersistenceState.NOT_PERSISTED belongs in this same class, or whether it would be more suitable somewhere else.
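To make the proposed transition concrete, here is a minimal sketch of the decision it would involve. It is a simplified, hypothetical model and not Alluxio code: the class `LostFileRecoverySketch`, the method `recover`, and its parameters are invented for illustration, and only the state names are taken from this issue. It assumes the master can tell which block IDs are currently reported by live workers.

```java
import java.util.List;
import java.util.Set;

// Hypothetical, simplified model of the proposed recovery step.
// None of these types are Alluxio's real classes.
class LostFileRecoverySketch {

  // State names as used in this issue.
  enum PersistenceState { NOT_PERSISTED, TO_BE_PERSISTED, PERSISTED, LOST }

  /**
   * Decides the state of a file after a worker has re-registered.
   *
   * @param current           the file's current persistence state
   * @param fileBlockIds      IDs of all blocks that make up the file
   * @param availableBlockIds IDs of blocks currently reported by live workers
   * @return NOT_PERSISTED if a LOST file is fully available again,
   *         otherwise the unchanged current state
   */
  static PersistenceState recover(PersistenceState current,
                                  List<Long> fileBlockIds,
                                  Set<Long> availableBlockIds) {
    // Only LOST files are candidates for recovery.
    if (current != PersistenceState.LOST) {
      return current;
    }
    // Recover only when every block of the file is back in Alluxio.
    if (availableBlockIds.containsAll(fileBlockIds)) {
      return PersistenceState.NOT_PERSISTED;
    }
    // Some blocks are still missing, so the file stays LOST.
    return PersistenceState.LOST;
  }

  public static void main(String[] args) {
    // A LOST file whose blocks 1 and 2 are both reported again.
    System.out.println(recover(PersistenceState.LOST,
        List.of(1L, 2L), Set.of(1L, 2L, 3L)));
    // A LOST file with block 2 still missing.
    System.out.println(recover(PersistenceState.LOST,
        List.of(1L, 2L), Set.of(1L)));
  }
}
```

In this sketch a recovered file returns to NOT_PERSISTED, as suggested above. For files written with ASYNC_THROUGH, a real fix would presumably also need to reschedule the persist job (or move the file to TO_BE_PERSISTED), which this model does not cover.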

github-actions[bot] commented 1 year ago

This issue has been automatically marked as stale because it has not had recent activity. It will be closed in two weeks if no further activity occurs. Thank you for your contributions.