hpc / mpifileutils

File utilities designed for scalability and performance.
https://hpc.github.io/mpifileutils
BSD 3-Clause "New" or "Revised" License

Add option to ignore hardlinks in dsync, dcmp and dwalk #565

Open rezib opened 8 months ago

rezib commented 8 months ago

This is a proposal to add a -H, --nohardlink option to dsync, dcmp and dwalk to ignore hardlinks when walking a file tree.

The rationale is to avoid producing multiple copies of the same inode, which could make the synchronized file tree consume much more storage than the source.

The corresponding manpages are also updated accordingly.

adilger commented 8 months ago

Note that if the source filesystem is Lustre and the client is mounted with user_fid2path or as root, you can use lfs path2fid --parents (or the llapi_path2parent() equivalent) to generate a list of up to 100 parent directory FIDs for a hard-linked file, and/or lfs fid2path (or the llapi_fid2path() equivalent) to generate the pathnames for the hard links to a file.

This would allow efficiently maintaining the hard links in the target filesystem without having to make a full separate copy of the file, or scan the source tree trying to find the links.

Alternately, I believe tar will keep an in-memory list of inode numbers with hard links and if they are encountered again during tree traversal it will store a hard link instead of the full file.

cedeyn commented 8 months ago

Hi @adilger, this patch came from CEA, where we use the Lustre filesystem. I totally agree, and that's what we did, but we also need to exclude hardlinks so that we can first make a full copy of the filesystem without hardlinks and then apply your method. This is a simple patch. The hard way would be to keep track of each inode in the mpifileutils tools and check whether we have already copied that inode: if it's already present, make a hardlink, else make a copy.
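For illustration, the "hard way" described above can be sketched in a single process. This is only a minimal sketch, not mpifileutils code: the function name `copy_tree_preserving_hardlinks` is hypothetical, it assumes source and destination both support hard links, and it keys on the (device, inode) pair rather than the inode number alone so that files on different devices are not confused.

```python
import os
import shutil


def copy_tree_preserving_hardlinks(src_root, dst_root):
    """Copy src_root to dst_root, recreating hard links instead of
    duplicating file data. Returns (files_copied, links_created)."""
    # Hypothetical helper for illustration only.
    seen = {}  # (st_dev, st_ino) -> destination path of the first copy
    copied = 0
    linked = 0
    for dirpath, _dirnames, filenames in os.walk(src_root):
        rel = os.path.relpath(dirpath, src_root)
        dst_dir = os.path.normpath(os.path.join(dst_root, rel))
        os.makedirs(dst_dir, exist_ok=True)
        for name in sorted(filenames):
            src = os.path.join(dirpath, name)
            dst = os.path.join(dst_dir, name)
            st = os.lstat(src)
            key = (st.st_dev, st.st_ino)
            if st.st_nlink > 1 and key in seen:
                # Inode already copied: link to the earlier copy.
                os.link(seen[key], dst)
                linked += 1
            else:
                # First time this inode is seen: copy the data.
                shutil.copy2(src, dst, follow_symlinks=False)
                if st.st_nlink > 1:
                    seen[key] = dst
                copied += 1
    return copied, linked
```

Only files with `st_nlink > 1` are recorded, so the table stays small when hard links are rare. A parallel tool cannot use a single in-memory table like this directly, since different processes may encounter the links to one inode.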

adammoody commented 8 months ago

Thanks for the patch @rezib , and thanks for the tip on the Lustre calls for hardlinks @adilger .

Yes, this looks simple enough to add, and I understand the need.

We could also look to add hardlink support for the copy. I need to think about this more, but I suspect we could support this across file systems in general by using DTCMP_Rankv(). For each file that has hardlinks (st.st_nlink > 1), I think during the walk we could add the path to a local list on each process as a (inode, path) pair. After the walk completes, we could then identify all paths that map to the same inode value using DTCMP_Rankv where we use inode as the sort key. The process that has rank 0 for a particular inode value after that operation would be responsible for copying the file, while those entries that are assigned rank > 0 could create hardlinks.
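The ranking idea can be sketched without MPI as follows. This is a single-process stand-in that only mimics the outcome described above (a rank per path within each inode group); it does not call DTCMP, and the function name `rank_by_inode` and the tuple layout are assumptions made for illustration.

```python
from collections import defaultdict


def rank_by_inode(pairs):
    """Given (inode, path) pairs for files with st_nlink > 1, return a
    plan of (inode, rank, path, action, link_target) tuples.

    Rank 0 in each inode group copies the file; every other rank
    creates a hard link to the rank-0 path.
    """
    # Hypothetical stand-in for a distributed ranking step.
    groups = defaultdict(list)
    for inode, path in pairs:
        groups[inode].append(path)

    plan = []
    for inode in sorted(groups):
        paths = sorted(groups[inode])  # deterministic order within a group
        first = paths[0]
        for rank, path in enumerate(paths):
            action = "copy" if rank == 0 else "link"
            plan.append((inode, rank, path, action, first))
    return plan


if __name__ == "__main__":
    pairs = [(7, "/a/x"), (3, "/b/y"), (7, "/a/w"), (7, "/c/z")]
    for entry in rank_by_inode(pairs):
        print(entry)
```

Two details the sketch makes visible: entries with rank > 0 need to learn the path of the rank-0 entry in their group, and the hard links can only be created after the rank-0 copy exists in the destination.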

We could perhaps use the Lustre calls as an optimization.

This needs to be fleshed out more...

Assuming we do that, would this option still be useful in other cases?