Open kc9jud opened 4 years ago
According to the NERSC HPSS documentation (https://docs.nersc.gov/filesystems/archive/#avoid-very-large-files), files over 2TB are inefficient when put to HPSS. They recommend breaking files that exceed that limit into 500GB chunks.
The hsi handler mcscript.task.archive_handler_hsi() should:
- put the archive with hsi directly if it is smaller than the threshold, or
- use split to break up the file into smaller segments and put those with hsi.
In addition (so that consumers of archives don't need to be aware of this splitting behavior):
- mcscript should also provide a wrapper function to fetch and reassemble archives,
c849af4887cd94fe9141cbdd11eac50ba3a8799a implements the logic in archive_handler_hsi()
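For discussion, here is a minimal sketch of the split-or-put decision and of the reassembly step. It is not the code from the commit referenced in this issue. The names plan_put_commands() and reassemble(), the ".part" segment prefix, the three-digit numeric suffix, and the decimal reading of 2TB and 500GB are all assumptions made for illustration. The sketch only builds the command lines; it does not run hsi or split.

```python
import math
import shutil

# Assumed decimal interpretation of the sizes quoted from the NERSC docs.
THRESHOLD = 2 * 10**12    # 2 TB: files smaller than this go to hsi directly
CHUNK_SIZE = 500 * 10**9  # 500 GB: segment size for larger files


def plan_put_commands(archive_path, size, threshold=THRESHOLD, chunk_size=CHUNK_SIZE):
    """Return the commands (as argv lists) to archive one file of `size` bytes.

    Hypothetical helper: nothing is executed here.
    """
    if size < threshold:
        # Small enough: a single hsi put.
        return [["hsi", "put", archive_path]]

    # Otherwise split into fixed-size segments with numeric suffixes
    # (archive.part000, archive.part001, ...) and put each segment.
    prefix = archive_path + ".part"
    count = math.ceil(size / chunk_size)
    commands = [["split", "-b", str(chunk_size), "-d", "-a", "3", archive_path, prefix]]
    commands += [["hsi", "put", f"{prefix}{i:03d}"] for i in range(count)]
    return commands


def reassemble(segment_paths, output_path):
    """Concatenate already-fetched segments into output_path.

    Zero-padded suffixes sort lexicographically, so sorted() restores
    the original order.
    """
    with open(output_path, "wb") as out:
        for segment in sorted(segment_paths):
            with open(segment, "rb") as src:
                shutil.copyfileobj(src, out)
    return output_path


# Example: a 2.3 TB archive is planned as one split followed by five puts.
for command in plan_put_commands("run0001-archive.tgz", 23 * 10**11):
    print(" ".join(command))
```

Keeping the planning separate from execution makes the threshold logic easy to test without HPSS access. A fetch wrapper could then get every segment that matches the prefix and pass the list to reassemble().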