Open jti-lanl opened 8 years ago
Related to #96. Should we merge?
Open needs to put xattrs on any newly created file.
Well, currently open() does put xattrs on new files, but I've learned that it doesn't have to. We should fix open-for-write to use whatever the repo says. We only keep DIRECT files for reading. Then mknod() no longer has to communicate with open().
On Jul 11, 2016, at 3:38 PM, Brett Kettering notifications@github.com wrote:
Related to #96. Should we merge?
Open needs to put xattrs on any newly created file.
This was item "(3b)" in issue #83. Moved here, so we can close that.
For MD insert performance tests, with zero-length files, the simple approach to avoiding the cost of writing xattrs would be to use a DIRECT repo, because then you definitely won't be writing xattrs. If you wanted, you could have a namespace that uses DIRECT for size=0 only, like so:
Then, MD insert testing via pftool would use a synthetic source which creates source-filenames (and reports length 0 for them), and libmarfs would do the open() with a DIRECT repo, and no xattrs would be written.
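As a rough illustration of the size-based selection described above, here is a minimal sketch in Python. It is not MarFS code or MarFS configuration syntax: `Repo`, `SizeRange`, `select_repo`, `needs_xattrs`, the repo names, and the "OBJECT" label are all hypothetical, and the only behavior taken from this thread is that a DIRECT repo writes no xattrs.

```python
from dataclasses import dataclass

DIRECT = "DIRECT"   # access method from the thread: no xattrs are written
OBJECT = "OBJECT"   # hypothetical label for any non-DIRECT access method


@dataclass(frozen=True)
class Repo:
    name: str            # hypothetical repo name
    access_method: str   # DIRECT or OBJECT


@dataclass(frozen=True)
class SizeRange:
    min_size: int
    max_size: int        # inclusive upper bound; -1 means "no upper bound" (assumption)
    repo: Repo


def select_repo(ranges, size):
    """Return the repo of the first range that contains `size`."""
    for r in ranges:
        if size >= r.min_size and (r.max_size == -1 or size <= r.max_size):
            return r.repo
    raise ValueError(f"no repo configured for size {size}")


def needs_xattrs(repo):
    """Only non-DIRECT repos require xattrs on a newly created file."""
    return repo.access_method != DIRECT


# Hypothetical namespace: DIRECT for size 0 only, an object repo for everything else.
EXAMPLE_RANGES = [
    SizeRange(0, 0, Repo("direct_md_only", DIRECT)),
    SizeRange(1, -1, Repo("object_store", OBJECT)),
]
```

With these assumed ranges, a zero-length create resolves to the DIRECT repo and skips the xattr writes, while any non-empty file resolves to the object repo and keeps them.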
However, on a non-DIRECT repo, just skipping writing the xattrs seems like trouble. When we later stat that file, the lack of xattrs implies that it is DIRECT. Therefore, a writer overwriting the file would simply overwrite it in place. It doesn't know any better.
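The misclassification described above can be sketched as follows. This is a toy model, not libmarfs: xattrs are represented as a plain dict, `OBJID_XATTR` is an assumed xattr name used only for illustration, and `inferred_access` and `overwrite_plan` are hypothetical helpers.

```python
# Assumed name for the xattr that marks a file as object-backed (illustrative only).
OBJID_XATTR = "user.marfs_objid"


def inferred_access(xattrs):
    """What a later stat concludes from the metadata file alone:
    no MarFS xattrs implies the file is DIRECT."""
    return "OBJECT" if OBJID_XATTR in xattrs else "DIRECT"


def overwrite_plan(xattrs):
    """How a writer would treat an existing file, given only its xattrs."""
    if inferred_access(xattrs) == "DIRECT":
        return "overwrite in place"
    return "write to object storage and update xattrs"


# A zero-length file created on a non-DIRECT repo, with the xattr write skipped:
skipped_xattrs = {}
# The same file created with xattrs, as open() does today:
with_xattrs = {OBJID_XATTR: "some-object-id"}
```

In this model, `overwrite_plan(skipped_xattrs)` yields "overwrite in place" even though the file belongs to a non-DIRECT repo, which is the trouble the paragraph above points out.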
Issue #96 would address some of this. But not all apps have easy access to the namespace. An inode-scan can't tell that this file is not DIRECT. That means the file will not be considered by the quota-update tool, or the packer. It would make no contribution to used-storage, but it does count as an inode. Therefore, a malicious user would be unconstrained in the number of empty files s/he could create. [Created issue #97 to address this.]
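To make the quota concern concrete, here is a small sketch of an inode scan that, like the tools described above, only considers files carrying MarFS xattrs. The scan logic, the `MARFS_XATTR` name, and the tuple layout are assumptions for illustration, not the actual quota-update tool.

```python
# Assumed marker xattr; a file without it looks DIRECT to the scan (illustrative only).
MARFS_XATTR = "user.marfs_objid"


def scan(files):
    """files: iterable of (path, size, xattrs) tuples.

    Returns (counted_inodes, used_bytes, skipped_paths), where skipped_paths
    are files the scan cannot tell apart from DIRECT files.
    """
    counted_inodes = 0
    used_bytes = 0
    skipped_paths = []
    for path, size, xattrs in files:
        if MARFS_XATTR not in xattrs:
            # Still occupies a real inode, but never reaches the quota tally.
            skipped_paths.append(path)
            continue
        counted_inodes += 1
        used_bytes += size
    return counted_inodes, used_bytes, skipped_paths


EXAMPLE_FILES = [
    ("data.bin", 10, {MARFS_XATTR: "some-object-id"}),
    ("empty1", 0, {}),   # zero-length file created without xattrs
    ("empty2", 0, {}),
]
```

Under these assumptions, `scan(EXAMPLE_FILES)` tallies one inode and 10 bytes, while the two empty files consume inodes that are never charged to anyone.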
The packer has nothing to gain from packing zero-length files. They will take up inode-space, whether packed or not, and they would only clog up the packed file with recovery-info. I believe we are willing to lose zero-length files if the MDFS is lost completely and has to be recovered from the recovery-info in objects. Zero-length files would also add complexity to the packer's task (no object-ID). So, perhaps it would be okay for the packer to ignore them.