E3SM-Project / zstash

Long term HPSS archiving tool for E3SM
BSD 3-Clause "New" or "Revised" License

`zstash` is always non-blocking #290

Open · forsyth2 opened 1 year ago

forsyth2 commented 1 year ago

zstash is always non-blocking. Reported by @golaz:

Running `zstash create -v --hpss=globus://nersc/home/g/golaz/2023/E3SM/fme/${EXP} --maxsize 128 .` results in non-blocking archiving. This is unexpected, since the `--non-blocking` option isn't included. That option was introduced in #214, before the last release (zstash v1.3.0).

forsyth2 commented 1 year ago

Debugging

Possible definitions of "non-blocking":

  1. Zstash does not wait for a transfer to complete before starting to create the next tarball (definition from https://github.com/E3SM-Project/zstash/issues/171).
  2. Zstash submits every tarball that has been created so far to be transferred (the desired behavior from implementing https://github.com/E3SM-Project/zstash/issues/171). Note that point 1 is necessary but not sufficient for point 2: point 1 must hold for point 2 to hold. A sketch contrasting the two follows this list.
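To make the distinction concrete, here is a rough pseudocode sketch of the two definitions; `submit()` and `wait()` stand in for Globus calls and are not actual zstash functions:

```python
def definition_1(tarballs, submit, wait):
    # Definition 1: don't wait for a transfer to finish before creating
    # and submitting the next tarball; wait only at the very end.
    tasks = [submit(tar) for tar in tarballs]
    for task in tasks:
        wait(task)

def definition_2(finished_tarballs, submit):
    # Definition 2: at submission time, batch every tarball created so
    # far into a single transfer task.
    return submit(list(finished_tarballs))
```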
```
$ git grep -n "submit_transfer"
globus.py:199:        task = transfer_client.submit_transfer(transfer_data)
globus.py:273:            last_task = transfer_client.submit_transfer(transfer_data)
```

So, there are two places in the code where Globus transfers are submitted.

The second instance (globus.py:273) is inside globus_finalize (defined at globus.py:264, per the grep below), which does receive a non_blocking argument.

The first instance (globus.py:199) has no non_blocking check near it, judging by the grep below:

```
$ git grep -n "non_blocking"
create.py:93:    globus_finalize(non_blocking=args.non_blocking)
create.py:169:    if args.non_blocking:
globus.py:264:def globus_finalize(non_blocking: bool = False):
globus.py:289:    if not non_blocking:
update.py:48:    globus_finalize(non_blocking=args.non_blocking)
```
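For reference, a blocking gate around `submit_transfer` typically looks something like the sketch below, written against the public `globus_sdk` API. This is an illustration only, mirroring the names from the grep output above, not zstash's actual implementation:

```python
import globus_sdk

def submit_and_maybe_wait(
    transfer_client: globus_sdk.TransferClient,
    transfer_data: globus_sdk.TransferData,
    non_blocking: bool = False,
) -> str:
    # Submit the transfer task (cf. globus.py:199 and globus.py:273).
    task = transfer_client.submit_transfer(transfer_data)
    task_id = task["task_id"]
    if not non_blocking:
        # task_wait() returns False if the timeout elapses before the task
        # completes, so loop until it reports completion.
        while not transfer_client.task_wait(task_id, timeout=300, polling_interval=30):
            pass
    return task_id
```

If only the `globus_finalize` path (globus.py:289) applies such a gate, then transfers submitted at globus.py:199 would never block, which would be consistent with the reported behavior.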
forsyth2 commented 1 year ago

@lukaszlacinski Do you have any insights on this? Thank you.

forsyth2 commented 1 year ago

It may make sense to address this in the next release (rather than the one currently in progress).

From @golaz: That would give us an opportunity to rethink the --keep / --non-blocking functionality in conjunction with the three destinations (disk, HPSS, Globus). --keep is the default for some destinations but not others, and similarly for --non-blocking.

golaz commented 1 year ago

I wonder whether we even need a blocking / non-blocking option for Globus. In the past, the main motivation was for cases where a user was very close to their disk quota, so tar files could be deleted after their transfer.

The best option might be a combination of non-blocking Globus transfers with the option of purging tar files after their successful transfer.
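A sketch of what that purge-after-success behavior could look like, assuming the public `globus_sdk` API; the `purge_after_transfer` helper, the `keep` flag wiring, and the path handling are all hypothetical:

```python
import os
import globus_sdk

def purge_after_transfer(
    transfer_client: globus_sdk.TransferClient,
    task_id: str,
    tar_paths: list[str],
    keep: bool = False,
) -> None:
    # Globus Transfer task statuses include ACTIVE, SUCCEEDED, and FAILED.
    status = transfer_client.get_task(task_id)["status"]
    if status == "SUCCEEDED" and not keep:
        for path in tar_paths:
            os.remove(path)  # safe to purge: the remote copy is confirmed
```

This would keep the transfers themselves non-blocking while still freeing local disk as soon as each batch is confirmed.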

forsyth2 commented 1 month ago

@TonyB9000 Do you have any input on this topic? Does "The best option might be a combination of non-blocking Globus transfers with the option of purging tar files after their successful transfer." in the comment above sound like your use case for publication?

TonyB9000 commented 1 month ago

@forsyth2 I really need to better understand the "blocking vs non-blocking" behaviors (never having exercised either of them).

By my understanding, in BOTH cases the user resides on a "source system" with "loose" files (not in zstash format) and wants to end up with zstash-archived files on a remote (HPSS-storage) system. The documentation is honestly unclear on this.

I can envision three modes of operation:

Mode 1: Create the entire finished archive locally (requiring a large local volume), then begin the Globus transfer.
Mode 2: Begin creating the local archive, irrespective of Globus status, with a Globus transfer invoked as each tar file is completed.
Mode 3: Begin creating the local archive, but wait for each tar file to be Globus-transferred before creating the next tar file.

Modes 1 and 2 can result in a large local footprint if Globus is slow or hangs.

I'm guessing that mode 2 is the "non-blocking" one, meaning that zstash tar-file creation does not block waiting for Globus transfers to complete. I don't really see the value of mode 2. There may be fewer "Globus transfer jobs", but the files are generally huge, so the overhead of multiple transfers seems minimal. Is the idea to save local CPU time (all tar files completed early, with Globus catching up eventually)?

And if --keep is invoked, mode 3 is meaningless (unless Globus balks when it has to wait for a file to transfer). Exactly how zstash invokes Globus matters here.

A hybrid mode might support "block at 1 TB": zstash produces tar files until their combined volume reaches 1 TB, then blocks until those tar files have been transferred (see the sketch below).
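A rough sketch of that hybrid, under stated assumptions: `tar_queue`, `submit_transfer`, and `wait_for_transfer` are placeholder callables, not zstash API:

```python
import os

MAX_PENDING_BYTES = 1 << 40  # 1 TiB threshold, per the suggestion above

def hybrid_archive(tar_queue, submit_transfer, wait_for_transfer):
    pending_tars, pending_bytes = [], 0
    for tar_path in tar_queue:  # tar files in creation order
        pending_tars.append(tar_path)
        pending_bytes += os.path.getsize(tar_path)
        if pending_bytes >= MAX_PENDING_BYTES:
            # Block until the accumulated batch has landed remotely.
            wait_for_transfer(submit_transfer(pending_tars))
            pending_tars, pending_bytes = [], 0
    if pending_tars:  # flush any final partial batch
        wait_for_transfer(submit_transfer(pending_tars))
```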

Question: Globus transfers incur their own slowness, but then so does HPSS tape writing. Is the HPSS delay treated as part of the Globus delay? Or does Globus tell HPSS "store this" and return immediately, ready to accept a new file transfer?