Closed adammoody closed 9 months ago
@daltonbohning, I hope you're doing well.
We've had some requests to add O_NOATIME to some tools. I've opened this PR to do that. It'd be good to know whether these changes are valid for DAOS systems.
Is there someone who can help us check that?
Hey Adam. Yes, I'm doing well. Hope the same for you!
It looks like DAOS/DFS doesn't respect passing O_NOATIME, though; atime is still updated. I tested and it doesn't break anything. Extra bits set in flags are just ignored. So this change is safe for us, and I'll create an internal ticket to see if we want to handle that flag when passed.
Thanks for the heads up!
DAOS JIRA for reference: https://daosio.atlassian.net/browse/DAOS-14479
We discussed this in the context of DAOS, and we don't actually store atime with the file. It's only populated in the stat buf to the greater of mtime or ctime. So handling O_NOATIME wouldn't help anything because it would just get "reset" on the next file open.
TODO: we'll need to be a bit more clever when copying files that are readable but not owned by the user.
From man 2 open:
O_NOATIME (since Linux 2.6.8) Do not update the file last access time (st_atime in the inode) when the file is read(2).
This flag can be employed only if one of the following conditions is true:
- The effective UID of the process matches the owner UID of the file.
- The calling process has the CAP_FOWNER capability in its user namespace and the owner UID of the file has a mapping in the namespace.
This flag is intended for use by indexing or backup programs, where its use can significantly reduce the amount of disk activity. This flag may not be effective on all filesystems. One example is NFS, where the server maintains the access time.
and potential error:
EPERM The O_NOATIME flag was specified, but the effective user ID of the caller did not match the owner of the file and the caller was not privileged.
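One possible way to cope with the EPERM case quoted above is to try the open with O_NOATIME first and retry without it if the kernel rejects the flag. This is only a sketch: `open_read_noatime` is a hypothetical helper, not an mpifileutils function, and the fallback to `O_NOATIME 0` on platforms that lack the flag is an assumption of this example.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE   /* exposes O_NOATIME in <fcntl.h> on Linux/glibc */
#endif
#include <errno.h>
#include <fcntl.h>

#ifndef O_NOATIME
#define O_NOATIME 0   /* assumption: treat the flag as a no-op where it is missing */
#endif

/* Hypothetical helper: open a file read-only without updating atime,
 * retrying without O_NOATIME if the flag is rejected with EPERM. */
int open_read_noatime(const char *path)
{
    int fd = open(path, O_RDONLY | O_NOATIME);
    if (fd < 0 && errno == EPERM) {
        /* Caller is neither the owner nor privileged (CAP_FOWNER):
         * fall back to a plain open and accept that atime may change. */
        fd = open(path, O_RDONLY);
    }
    return fd;
}
```

The trade-off is one extra open() call for each file the user does not own, in exchange for not needing the owner id ahead of time.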
Thanks, @daltonbohning. And thanks for your super fast response!
It sounds like tar updates source file atimes by default, but one can attempt to preserve atime with an option:
https://www.gnu.org/software/tar/manual/html_section/Attributes.html
When tar reads files, it updates their access times. To avoid this, use the ‘--atime-preserve[=METHOD]’ option, which can either reset the access time retroactively or avoid changing it in the first place.
For the ability to use O_NOATIME, we do a similar check in mfu_flist_chmod():
TODO: this code doesn't really accomplish what it claims to do. I'll fix that later.
Apparently, rsync v3.2.0 provides the following options for atime:
https://download.samba.org/pub/rsync/rsync.1
--atimes, -U
This tells rsync to set the access (use) times of the destination
files to the same value as the source files.
If repeated, it also sets the --open-noatime option, which can
help you to make the sending and receiving systems have the same
access times on the transferred files without needing to run
rsync an extra time after a file is transferred.
Note that some older rsync versions (prior to 3.2.0) may have
been built with a pre-release --atimes patch that does not imply
--open-noatime when this option is repeated.
--open-noatime
This tells rsync to open files with the O_NOATIME flag (on systems
that support it) to avoid changing the access time of the
files that are being transferred. If your OS does not support
the O_NOATIME flag then rsync will silently ignore this option.
Note also that some filesystems are mounted to avoid updating
the atime on read access even without the O_NOATIME flag being
set.
Tip from: https://unix.stackexchange.com/questions/630228/rsync-keep-access-time-atime-how
This adds an --open-noatime option to a number of tools, which adds the O_NOATIME flag when opening files to avoid updating the file last access time.

Many centers use last access time to filter files for purge operations, and they would prefer not to change file atime values when making backup copies with dsync or scanning the file system for duplicate files with ddup. Adding this flag may also improve read performance on some file systems.

The O_NOATIME flag is only allowed when the effective user id matches the owner of the file or when the process is running with the CAP_FOWNER capability. A normal user will encounter errors when using O_NOATIME when reading from a shared directory containing files owned by other users, even if the current user has read access to all files.

The following tools are affected:

- ddup - when reading files to compute hash values
- dcp and dsync - when reading source files during a copy
- dcmp and dsync - when reading source and destination files while comparing their contents
- dtar - while reading source files when creating an archive

When --open-noatime is specified with ddup, the tool checks the owner user id of each file and conditionally adds O_NOATIME if the process effective user id matches. This allows normal users to specify the --open-noatime option, even when running ddup on files that they don't own. The atime will be updated on files that the user can read but does not own.

For the remaining tools, the current algorithms do not expose the file owner id in a way to allow for an easy check. In this case, O_NOATIME is added when opening all files. Normal users will thus encounter an error if the tool attempts to open any file that they do not own.

Resolves: https://github.com/hpc/mpifileutils/issues/557 https://github.com/hpc/mpifileutils/pull/534
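The per-file ownership check described for ddup can be sketched roughly as follows. This is not the code from this PR: `read_flags_for_owner` is a hypothetical helper, the owner uid is assumed to be already known from an earlier stat or file walk, and the sketch deliberately ignores the CAP_FOWNER case.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE   /* exposes O_NOATIME in <fcntl.h> on Linux/glibc */
#endif
#include <fcntl.h>
#include <sys/types.h>
#include <unistd.h>

#ifndef O_NOATIME
#define O_NOATIME 0   /* assumption: treat the flag as a no-op where it is missing */
#endif

/* Hypothetical helper: choose open() flags for reading a file whose
 * owner uid is already known.  O_NOATIME is added only when the option
 * was requested and the effective uid of the process owns the file. */
int read_flags_for_owner(uid_t owner, int open_noatime)
{
    int flags = O_RDONLY;
    if (open_noatime && owner == geteuid()) {
        flags |= O_NOATIME;
    }
    return flags;
}
```

A caller would pass the result straight to open(), for example `open(path, read_flags_for_owner(st.st_uid, 1))`, so files owned by other users are opened normally and simply have their atime updated.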