Alluxio / alluxio

Alluxio, data orchestration for analytics and machine learning in the cloud
https://www.alluxio.io
Apache License 2.0

Alluxio throws FileDoesNotExistException when it cannot access UFS, which may cause un-persisted files to be lost #16236

Open secfree opened 1 year ago

secfree commented 1 year ago

Alluxio Version: 2.7.1

Describe the bug

The file exists in Alluxio, but the client gets a FileDoesNotExistException from AlluxioMaster when AlluxioMaster cannot access the UFS.

To Reproduce

  1. Set "alluxio.user.file.metadata.sync.interval=0" to trigger metadata sync from the UFS
  2. Mount an HDFS path into Alluxio
  3. Access the path in Alluxio to sync its metadata into Alluxio
  4. Transition the HDFS NameNode to standby state
  5. Access the path in Alluxio again; the client gets FileDoesNotExistException

    $ ./bin/alluxio fs ls /dev2_mnt/zhaoqun.deng/220926
    Path "/dev2_mnt/zhaoqun.deng/220926" does not exist.

Expected behavior

Alluxio returns the existing file status instead of throwing FileDoesNotExistException.

Urgency

Urgent, as this may cause users to think data has been lost (the file appears to be missing).

Are you planning to fix it

Yes

yyongycy commented 1 year ago

Could you please paste the master's DEBUG log? That would make things clearer.

secfree commented 1 year ago

> Could you please paste the master's DEBUG log? That would make things clearer.

Let me share my trace results.

  1. alluxio fs ls triggers HdfsUnderFileSystem.getStatus, which logs the following:

    2022-09-26 11:56:38,310 INFO  RetryInvocationHandler - org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.ipc.StandbyException): Operation category READ is not supported in state standby. Visit https://s.apache.org/sbnn-error
        at org.apache.hadoop.hdfs.server.namenode.ha.StandbyState.checkOperation(StandbyState.java:108)
        at org.apache.hadoop.hdfs.server.namenode.NameNode$NameNodeHAContext.checkOperation(NameNode.java:2076)
        at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkOperation(FSNamesystem.java:1422)
        at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getFileInfo(FSNamesystem.java:3001)
        at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.getFileInfo(NameNodeRpcServer.java:1228)
        at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.getFileInfo(ClientNamenodeProtocolServerSideTranslatorPB.java:894)
        at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
        ...
    , while invoking ClientNamenodeProtocolTranslatorPB.getFileInfo over xxxx:8020 after 1 failover attempts. Trying to failover after sleeping for 880ms.

  2. HdfsUnderFileSystem.getStatus throws the RemoteException after exhausting its failover attempts

  3. The call stack of HdfsUnderFileSystem.getStatus is

    @alluxio.underfs.hdfs.HdfsUnderFileSystem.getStatus()
        at alluxio.underfs.BaseUnderFileSystem.getFingerprint(BaseUnderFileSystem.java:104)
        at alluxio.underfs.UnderFileSystemWithLogging$25.call(UnderFileSystemWithLogging.java:568)
        at alluxio.underfs.UnderFileSystemWithLogging$25.call(UnderFileSystemWithLogging.java:565)
        at alluxio.underfs.UnderFileSystemWithLogging.call(UnderFileSystemWithLogging.java:1212)
        at alluxio.underfs.UnderFileSystemWithLogging.getFingerprint(UnderFileSystemWithLogging.java:565)
        at alluxio.master.file.InodeSyncStream.lambda$syncExistingInodeMetadata$2(InodeSyncStream.java:599)
        at java.util.concurrent.FutureTask.run(FutureTask.java:266)
  4. BaseUnderFileSystem.getFingerprint catches the exception and returns Constants.INVALID_UFS_FINGERPRINT (https://github.com/Alluxio/alluxio/blob/master/core/common/src/main/java/alluxio/underfs/BaseUnderFileSystem.java#L103)

  5. InodeSyncStream.syncExistingInodeMetadata deletes the inode of the file because of Constants.INVALID_UFS_FINGERPRINT (https://github.com/Alluxio/alluxio/blob/master/core/server/master/src/main/java/alluxio/master/file/InodeSyncStream.java#L776)

  6. DefaultFileSystemMaster.listStatus calls checkLoadMetadataOptions which throws FileDoesNotExistException (https://github.com/Alluxio/alluxio/blob/master/core/server/master/src/main/java/alluxio/master/file/DefaultFileSystemMaster.java#L1334)
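
Steps 4 to 6 can be condensed into a minimal sketch. The class, interface, and method names below are hypothetical stand-ins, not Alluxio's real classes, and the empty-string sentinel is only assumed to play the role of Constants.INVALID_UFS_FINGERPRINT. The point it illustrates is that once every IOException is collapsed into one sentinel, "UFS unreachable" can no longer be told apart from "file deleted in UFS".

```java
import java.io.FileNotFoundException;
import java.io.IOException;
import java.net.ConnectException;
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-ins for illustration; these are not Alluxio's real classes.
final class SwallowedErrorSketch {
  // Assumed sentinel, playing the role of Constants.INVALID_UFS_FINGERPRINT.
  static final String INVALID_FINGERPRINT = "";

  interface Ufs {
    String getStatus(String path) throws IOException;
  }

  // Step 4: every IOException, whatever its cause, collapses into the sentinel.
  static String getFingerprint(Ufs ufs, String path) {
    try {
      return ufs.getStatus(path);
    } catch (IOException e) {
      return INVALID_FINGERPRINT;
    }
  }

  // Step 5: the sentinel is read as "gone from the UFS", so the inode is dropped.
  static void sync(Map<String, String> inodes, Ufs ufs, String path) {
    String fingerprint = getFingerprint(ufs, path);
    if (INVALID_FINGERPRINT.equals(fingerprint)) {
      inodes.remove(path);
    } else {
      inodes.put(path, fingerprint);
    }
  }

  // Step 6: a later listing no longer finds the inode.
  // FileNotFoundException stands in for Alluxio's FileDoesNotExistException here.
  static String list(Map<String, String> inodes, String path) throws FileNotFoundException {
    if (!inodes.containsKey(path)) {
      throw new FileNotFoundException("Path \"" + path + "\" does not exist.");
    }
    return path;
  }

  public static void main(String[] args) {
    Map<String, String> inodes = new HashMap<>();
    inodes.put("/mnt/file.03", "fingerprint-v1");
    // A UFS whose NameNode refuses connections, as in the reproduction.
    Ufs unreachable = path -> {
      throw new ConnectException("Connection refused");
    };
    sync(inodes, unreachable, "/mnt/file.03");
    try {
      list(inodes, "/mnt/file.03");
    } catch (FileNotFoundException e) {
      System.out.println(e.getMessage());
    }
  }
}
```

In this sketch the file was never removed from the UFS, yet the listing fails exactly as in the reproduction above.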

secfree commented 1 year ago

> Could you please paste the master's DEBUG log? That would make things clearer.

Tested it again and here is the debug log

2022-09-27 14:27:47,567 DEBUG UnderFileSystemWithLogging - Enter: GetFingerprint(path=hdfs://dev2/user/zhaoqun.deng/220927/file.03)
2022-09-27 14:27:47,572 INFO  RetryInvocationHandler - org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.ipc.StandbyException): Operation category READ is not supported in state standby. Visit https://s.apache.org/sbnn-error
        at org.apache.hadoop.hdfs.server.namenode.ha.StandbyState.checkOperation(StandbyState.java:108)
        at org.apache.hadoop.hdfs.server.namenode.NameNode$NameNodeHAContext.checkOperation(NameNode.java:2088)
        at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkOperation(FSNamesystem.java:1428)
        at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getFileInfo(FSNamesystem.java:3051)
        at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.getFileInfo(NameNodeRpcServer.java:1232)
        ...
, while invoking ClientNamenodeProtocolTranslatorPB.getFileInfo over xxxx:8020. Trying to failover immediately.
2022-09-27 14:27:48,784 DEBUG BaseUnderFileSystem - Failed fingerprint. path: hdfs://dev2/user/zhaoqun.deng/220927/file.03 error: java.net.ConnectException: Call From xxxx to xxxx:8020 failed on connection exception: java.net.ConnectException: Connection refused; For more details see:  http://wiki.apache.org/hadoop/ConnectionRefused
2022-09-27 14:27:48,784 DEBUG UnderFileSystemWithLogging - Exit (OK): GetFingerprint(path=hdfs://dev2/user/zhaoqun.deng/220927/file.03) in 1217 ms

2022-09-27 14:27:48,792 WARN  FileSystemMasterClientServiceHandler - Exit (Error): ListStatus: request=path: "/dev2_mnt/zhaoqun.deng/220927/file.03"
options {
  loadMetadataType: ONCE
  commonOptions {
    syncIntervalMs: 0
    ttl: -1
    ttlAction: DELETE
  }
  recursive: false
  loadMetadataOnly: false
}
, Error=alluxio.exception.FileDoesNotExistException: Path "/dev2_mnt/zhaoqun.deng/220927/file.03" does not exist.

yyongycy commented 1 year ago

Thanks a lot for the very detailed information. I am wondering if there is a way to pass the situation (no permission) in step 4 (https://github.com/Alluxio/alluxio/issues/16236#issuecomment-1257891644)? In that case, this information can be passed back to the upper layer explicitly, and the upper layer can handle it more clearly.

secfree commented 1 year ago

> wondering if there is a way to pass the situation (No permission) in step 4

Yes, I am working on this approach and will open a PR if it works well.

secfree commented 1 year ago

Tested and confirmed: because InodeSyncStream.syncExistingInodeMetadata may delete the inode of a file or directory on Constants.INVALID_UFS_FINGERPRINT, any files under a directory that have not been persisted may be lost.
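
To make the data-loss path concrete, here is a small sketch under stated assumptions. It models the inode tree as a hypothetical map from path to a "persisted" flag, which is not how Alluxio stores inodes, and shows that removing a directory's subtree also removes children that exist only in Alluxio.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical model: path -> whether the file has been persisted to the UFS.
final class UnpersistedLossSketch {
  // Removes dir and all of its descendants from the map.
  // Returns the removed paths that had never been persisted, i.e. the lost data.
  static List<String> deleteSubtree(Map<String, Boolean> persistedByPath, String dir) {
    List<String> lost = new ArrayList<>();
    Iterator<Map.Entry<String, Boolean>> it = persistedByPath.entrySet().iterator();
    while (it.hasNext()) {
      Map.Entry<String, Boolean> entry = it.next();
      String path = entry.getKey();
      if (path.equals(dir) || path.startsWith(dir + "/")) {
        if (!entry.getValue()) {
          lost.add(path);
        }
        it.remove();
      }
    }
    return lost;
  }

  public static void main(String[] args) {
    Map<String, Boolean> inodes = new TreeMap<>();
    inodes.put("/mnt/dir", true);
    inodes.put("/mnt/dir/persisted.txt", true);
    inodes.put("/mnt/dir/only-in-alluxio.txt", false);
    // The directory inode is deleted after a failed fingerprint lookup.
    System.out.println("Lost: " + deleteSubtree(inodes, "/mnt/dir"));
  }
}
```

Persisted files can be reloaded from the UFS once it is reachable again; the un-persisted ones have no other copy.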

yyongycy commented 1 year ago

> Tested and confirmed: because InodeSyncStream.syncExistingInodeMetadata may delete the inode of a file or directory on Constants.INVALID_UFS_FINGERPRINT, any files under a directory that have not been persisted may be lost.

I think this situation should be fixed. What is your fix logic?

secfree commented 1 year ago

> I think this situation should be fixed. What is your fix logic?

The fix logic is in #16245. It separates FileNotFoundException from general IOException (for example, ConnectException or StandbyException) and does not delete the inode from Alluxio unless the exception is a FileNotFoundException.
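
The idea can be sketched as follows. This is an illustration of the described approach only, not the code in #16245; the `Ufs` interface, the `UfsState` enum, and both method names are hypothetical.

```java
import java.io.FileNotFoundException;
import java.io.IOException;
import java.net.ConnectException;

// Hypothetical sketch of "separate not-found from other I/O failures".
final class SeparateErrorsSketch {
  enum UfsState { PRESENT, ABSENT, UNKNOWN }

  interface Ufs {
    String getStatus(String path) throws IOException;
  }

  // Classifies the outcome of a UFS lookup instead of collapsing all errors.
  static UfsState probe(Ufs ufs, String path) {
    try {
      ufs.getStatus(path);
      return UfsState.PRESENT;
    } catch (FileNotFoundException e) {
      // The UFS answered and says the file is not there.
      return UfsState.ABSENT;
    } catch (IOException e) {
      // The UFS could not answer (connection refused, standby, ...).
      return UfsState.UNKNOWN;
    }
  }

  // Only a definite "not found" justifies deleting the inode.
  static boolean shouldDeleteInode(UfsState state) {
    return state == UfsState.ABSENT;
  }

  public static void main(String[] args) {
    Ufs unreachable = path -> {
      throw new ConnectException("Connection refused");
    };
    Ufs missing = path -> {
      throw new FileNotFoundException(path);
    };
    System.out.println(shouldDeleteInode(probe(unreachable, "/f")));
    System.out.println(shouldDeleteInode(probe(missing, "/f")));
  }
}
```

The order of the catch clauses matters: FileNotFoundException is a subclass of IOException, so it has to be caught first.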

yyongycy commented 1 year ago

The solution cannot be that simple. The following scenarios should be considered: (metadata up_to_date/stale, UFS reachable/not_reachable, no_sync/sync, Persisted/To_be_persisted).

Are all the combinations covered? A simple example is (stale, UFS not reachable, sync): what would be returned, and how should the data be handled in the Alluxio layer?
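
The four dimensions listed above give 16 combinations. The sketch below simply enumerates them so each can be checked one by one; the enum and method names are hypothetical. The `needsPolicy` predicate encodes an assumption rather than anything confirmed in this thread: that only a requested sync against an unreachable UFS needs a new decision.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical enumeration of the scenario matrix from the comment above.
final class SyncScenarioMatrix {
  enum Metadata { UP_TO_DATE, STALE }
  enum Ufs { REACHABLE, NOT_REACHABLE }
  enum Sync { NO_SYNC, SYNC }
  enum Persistence { PERSISTED, TO_BE_PERSISTED }

  // Builds every combination of the four dimensions as a readable label.
  static List<String> allScenarios() {
    List<String> scenarios = new ArrayList<>();
    for (Metadata m : Metadata.values()) {
      for (Ufs u : Ufs.values()) {
        for (Sync s : Sync.values()) {
          for (Persistence p : Persistence.values()) {
            scenarios.add(m + ", " + u + ", " + s + ", " + p);
          }
        }
      }
    }
    return scenarios;
  }

  // Assumption: a scenario needs a new policy only when a sync is requested
  // while the UFS cannot be reached.
  static boolean needsPolicy(Ufs ufs, Sync sync) {
    return ufs == Ufs.NOT_REACHABLE && sync == Sync.SYNC;
  }

  public static void main(String[] args) {
    for (String scenario : allScenarios()) {
      System.out.println(scenario);
    }
  }
}
```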

secfree commented 1 year ago

If the UFS is reachable, the logic is not affected by the PR.

If UFS not_reachable

That is the difference in the underlying philosophy.

One side effect of the PR is that the client may get stale data if the underlying data has already been updated or deleted while the UFS is not reachable. I think that is better than deleting the data in Alluxio directly.

github-actions[bot] commented 1 year ago

This issue has been automatically marked as stale because it has not had recent activity. It will be closed in two weeks if no further activity occurs. Thank you for your contributions.