mar-file-system / marfs

MarFS provides a scalable near-POSIX file system by using one or more POSIX file systems as a scalable metadata component and one or more data stores (object, file, etc) as a scalable data component.

failures due to non-existent chunks are sometimes hidden #156

Open jti-lanl opened 8 years ago

jti-lanl commented 8 years ago

Now that #154 is fixed, there is no known way to create this situation through MarFS itself, but it could presumably be reproduced by manually deleting a chunk-object, e.g. the second chunk in a Multi.

During the transition in reading from the first to second chunk, the attached log shows that the second GET fails, but that the failure is masked in stream_wait or stream_sync. (Log output was edited to obscure hosts, users, etc.)

dd reports success, even for long sequences covering multiple (non-existent) chunks. The MarFS file in question was written with all zeros, and dd gets zeros back, presumably because somebody's read-buffers are zeroed out.
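To make the suspected mechanism concrete, here is a toy model in Python. It is not MarFS code: `get_chunk`, `read_masking_errors`, `read_propagating_errors`, `ChunkMissing`, and `CHUNK_SIZE` are all hypothetical names invented for illustration, and the "store" is just a dict. It only shows why a swallowed GET failure plus a pre-zeroed read buffer would be indistinguishable from a good read when the file contains zeros.

```python
# Toy model only; every name here is hypothetical and none of it is MarFS code.
CHUNK_SIZE = 8


class ChunkMissing(Exception):
    """Stands in for a failed GET on a non-existent chunk-object."""


def get_chunk(store, index):
    """Return the bytes of chunk `index`, or raise if the object is gone."""
    if index not in store:
        raise ChunkMissing(f"chunk {index} not found")
    return store[index]


def read_masking_errors(store, n_chunks):
    """Reader that hides GET failures: the zeroed buffer is returned as data."""
    out = bytearray()
    for i in range(n_chunks):
        buf = bytearray(CHUNK_SIZE)  # read buffer starts out as all zeros
        try:
            buf[:] = get_chunk(store, i)
        except ChunkMissing:
            pass  # failure swallowed; caller still receives CHUNK_SIZE bytes
        out += buf
    return bytes(out)


def read_propagating_errors(store, n_chunks):
    """Reader that lets a failed GET surface to the caller."""
    out = bytearray()
    for i in range(n_chunks):
        out += get_chunk(store, i)
    return bytes(out)


def demo(pattern):
    """Write two chunks of `pattern`, delete the second, and report whether
    the masking reader still returns exactly what was written."""
    store = {0: pattern * CHUNK_SIZE, 1: pattern * CHUNK_SIZE}
    expected = store[0] + store[1]
    del store[1]  # simulate manually deleting the second chunk
    return read_masking_errors(store, 2) == expected
```

In this model `demo(b"\x00")` returns `True` (the missing chunk goes unnoticed), while `demo(b"\xa5")` returns `False`. If the real behavior is similar, reproducing the problem with a file written from a non-zero pattern should at least turn the silent success into a visible data mismatch.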

Note: this situation cannot arise from a failed write to a MarFS file. That would leave an "incomplete" file, marked with a RESTART xattr, and open() would always fail. What the log snippet shows is a read from a file that was written, apparently successfully, before #154 was fixed. But, as noted above, something like this might happen if a chunk-object were destroyed.

issue-154.log.txt