Open nsc-jens opened 3 years ago
Hi Jens,
Thanks for reporting this.
Yes, the underlying problem here is a mismatch between the NFS protocol semantics and how dCache operates.
Currently, dCache doors require that all files have a replica on some pool, even for zero-length files. Doors also don't allow read access to files until the initial upload is complete.
However, the NFS protocol has two different operations: one that creates the namespace entry and one that opens a file for writing. The touch command results in the client issuing the NFS request that creates the namespace entry, but doesn't open the file. This is indistinguishable from a client that would like to write a non-zero amount of data, but just hasn't opened the file so far.
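The ambiguity can be illustrated locally. The following is only a sketch on an ordinary filesystem (it issues no NFS requests, and the helper names `create_only` and `create_then_write` are made up for illustration): a file that was merely created and a file that is about to receive data look identical until the first write happens.

```python
import os
import tempfile


def create_only(path):
    # Create the entry and write nothing (a local stand-in for `touch`).
    fd = os.open(path, os.O_CREAT | os.O_WRONLY, 0o644)
    os.close(fd)


def create_then_write(path, payload):
    # Create the entry first, record its size, and only then write data.
    create_only(path)
    size_before = os.stat(path).st_size
    with open(path, "wb") as f:
        f.write(payload)
    return size_before


with tempfile.TemporaryDirectory() as d:
    touched = os.path.join(d, "touched")
    pending = os.path.join(d, "pending")

    create_only(touched)
    size_before_write = create_then_write(pending, b"data")

    touched_size = os.stat(touched).st_size   # stays zero-length
    pending_size = os.stat(pending).st_size   # non-zero after the write
```

Between creation and the first write, both entries report a size of zero, so an observer that sees only the namespace entry cannot tell an intentionally empty file from an upload that has not started yet.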
On your suggestions:
Yes, we could optimise doors so that they handle zero-length files without redirecting; however, the problem here is subtly different. dCache believes the file is still being uploaded, so we would either need to add read support for incomplete files, or update NFS to somehow identify files that should be zero-length.
Registering a file in Chimera only after completion would be possible, but would bring a number of disadvantages in how errors are handled and in how dCache behaves if multiple clients attempt to upload the same file. It would also prevent clients from deleting files that are in the process of being uploaded, which some use cases rely on.
Hi!
This is a continuation of RT ticket 10082
In dCache a zero-length file has a metadata entry. This can point to a file location on a pool holding a zero-length file, or it can be just the metadata entry with no file location. The former can be created when you upload an empty file over WebDAV, and the latter can be created with "touch foo" in an NFS-mounted file system (there are plenty of other ways).
In NFS these files can be read with the expected results:
Over other protocols this results in weird and bad behaviour:
As I understand it, this is because the doors cannot know whether the file is actually empty or is still being uploaded. In either case, this causes problems for users trying to use rclone, a browser or WinSCP to download data that contains empty files with no file location. I have plenty of those.
Suggestions:
Make the doors aware of the difference between the "open" state of files still being uploaded and the "closed and immutable" state of files whose transfers are done.
If the file size is zero in Chimera, why bother with a redirect in the case of the WebDAV door? Just hand the client zero bytes of data and finish the transfer.
A file could optionally be registered in Chimera when the upload is completed rather than when it is initiated. This would also remove a lot of corner cases where we today get empty files when a transfer is aborted for various reasons. This was a huge problem with NFS back in the day, but I think a number of issues have been solved since then. I guess this would break POSIX though (lockfiles wouldn't work).
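To make the first two suggestions concrete, here is a minimal sketch of how a door might decide what to do with a read request. It is a toy model, not dCache code: `UploadState`, `Entry` and `plan_read` are hypothetical names, and it assumes the namespace records whether an upload has been closed.

```python
from dataclasses import dataclass, field
from enum import Enum


class UploadState(Enum):
    OPEN = "open"        # upload may still be in progress
    CLOSED = "closed"    # transfer finished, content is immutable


@dataclass
class Entry:
    # Hypothetical namespace entry: size, upload state, replica locations.
    size: int = 0
    state: UploadState = UploadState.OPEN
    locations: list = field(default_factory=list)


def plan_read(entry):
    """Return what the door should do for a read of this entry."""
    if entry.state is UploadState.OPEN:
        # The door cannot tell an empty file from a pending upload.
        return "reject: upload in progress"
    if entry.size == 0:
        # Closed and empty: no pool is needed, answer directly.
        return "serve: zero bytes from door"
    if not entry.locations:
        return "error: no replica"
    return f"redirect: {entry.locations[0]}"
```

For example, `plan_read(Entry(state=UploadState.CLOSED))` yields the zero-byte shortcut without needing a file location, while an entry left in the `OPEN` state is still refused. The sketch only works if something marks the entry as closed, which is exactly the information a create-only NFS request does not carry.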