LLNL / UnifyFS

UnifyFS: A file system for burst buffers

Improve support for file staging #717

Closed MichaelBrim closed 2 years ago

MichaelBrim commented 2 years ago

Description

Various fixes and improvements to our support for serial and parallel stage-in/out. Primarily, this reimplements the unifyfs-stage helper program to use the library API rather than wrapped POSIX I/O. A new unifyfs_transfer_result structure has been added that provides additional information, including the transferred file size in bytes and the transfer time in seconds. For stage-in, there is initial support for specifying the data distribution to use: 'balanced' placement evenly divides the file data across servers in 16 MiB transfer chunks, while 'skewed' would allow for uneven data placement. Currently, only 'balanced' placement is supported for stage-in.
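
To make the 'balanced' placement arithmetic concrete, here is a minimal sketch in C of how a file could be split into 16 MiB chunks and spread across servers. It is illustrative only: the function names (`chunk_count`, `chunk_owner`, `bytes_on_server`) are hypothetical and not part of the UnifyFS API, and the round-robin assignment is an assumption, since the description above only says the data is divided evenly.

```c
#include <assert.h>
#include <stdint.h>

/* 16 MiB transfer chunk size, as stated in the PR description. */
#define CHUNK_SIZE ((uint64_t)16 * 1024 * 1024)

/* Number of chunks needed to cover file_size bytes.
 * The final chunk may be shorter than CHUNK_SIZE. */
uint64_t chunk_count(uint64_t file_size)
{
    return file_size / CHUNK_SIZE + (file_size % CHUNK_SIZE != 0);
}

/* ASSUMPTION: chunks are assigned to servers round-robin by index.
 * The actual UnifyFS placement policy may differ. */
int chunk_owner(uint64_t chunk_index, int num_servers)
{
    assert(num_servers > 0);
    return (int)(chunk_index % (uint64_t)num_servers);
}

/* Total bytes that server `rank` would hold under the assumed policy. */
uint64_t bytes_on_server(uint64_t file_size, int num_servers, int rank)
{
    assert(num_servers > 0 && rank >= 0 && rank < num_servers);

    uint64_t total = 0;
    uint64_t nchunks = chunk_count(file_size);

    for (uint64_t i = 0; i < nchunks; i++) {
        if (chunk_owner(i, num_servers) != rank) {
            continue;
        }
        uint64_t offset = i * CHUNK_SIZE;
        uint64_t len = file_size - offset;   /* bytes left from this offset */
        if (len > CHUNK_SIZE) {
            len = CHUNK_SIZE;                /* full chunk */
        }
        total += len;
    }
    return total;
}
```

Under these assumptions, a 100 MiB file staged in to 4 servers is split into 7 chunks (six full chunks plus one 4 MiB remainder), and the per-server totals differ by at most one chunk.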

A summary of changes to various components follows:

Client library:

Library API:

Examples & Tests:

unifyfsd Server:

unifyfs-stage helper program:

Motivation and Context

Addresses issue #686 and other user-reported problems with file staging support.

How Has This Been Tested?

Tested in serial and parallel transfer modes using OLCF Summit on up to 64 nodes, with a wide range of manifest files for stage-in/out. The manifest files contained up to 32 files and a wide variety of file sizes.

MichaelBrim commented 2 years ago

Still finishing up my testing on Summit. Will remove draft status once that's done.

adammoody commented 2 years ago

Thanks a ton, @MichaelBrim!