Open tangq opened 2 years ago
@tangq I'm not sure why they disappeared on a second try. The Exception notes that you might not have been logged into hsi the first time, but that's certainly not the only possible reason that tar failed to transfer.
I had another check job, run without specifying the tar file names, that failed with similar errors. So the more flexible input file name options you implemented are really helpful when checking the large simulations.
My question is: with the successful second check, is it safe to say that these tar files are correctly archived?
> So the more flexible input file name options you implemented are really helpful when checking the large simulations.
Great, glad to hear #170 is working well.
> My question is: with the successful second check, is it safe to say that these tar files are correctly archived?
If you have a log file from the check, you can run grep -i Exception <log_file> to double-check for any errors.
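For anyone who would rather script this check than run grep by hand, here is a minimal Python sketch of the same case-insensitive search. The function name find_exception_lines and its keyword parameter are made up for illustration and are not part of zstash.

```python
from pathlib import Path


def find_exception_lines(log_path, keyword="exception"):
    """Return (line_number, text) pairs for lines containing keyword, ignoring case.

    Hypothetical helper, roughly equivalent to: grep -i -n <keyword> <log_file>
    """
    matches = []
    needle = keyword.lower()
    # errors="replace" keeps the scan going if the log contains undecodable bytes.
    with Path(log_path).open("r", errors="replace") as log_file:
        for line_number, line in enumerate(log_file, start=1):
            if needle in line.lower():
                matches.append((line_number, line.rstrip("\n")))
    return matches
```

An empty list corresponds to grep printing nothing. It only means no line in that log mentions the keyword, so it is worth making sure the log covers the whole check run.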
The log file for the second try (only checking the 3 failed files) is at cori:/global/cscratch1/sd/tang30/E3SMv2/v2.NARRM.piControl/check2/out. Running grep -i Exception on it returns nothing.
The log file for the first try is: /global/cscratch1/sd/tang30/E3SMv2/v2.NARRM.piControl/zstash_check_20211209.log. That check is still running, and grepping for Exception returns the 3 files.
Good to know about the "exception" keyword. It returns fewer messages than "error".
> Good to know about the "exception" keyword
Great, I'm planning to include a note about that with #168.
I have also seen errors like this before. If the error is not reproducible, it is very likely that it was caused by some intermittent hsi issue or unavailability.
Also, I have found that hsi errors are more likely when retrieving to CSCRATCH. I now use cfs to run zstash check and it seems more reliable.
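Since these failures often clear up on a second try, one option is to retry automatically. Below is a rough Python sketch, not something zstash provides. The name run_with_retries is hypothetical, and the sketch assumes the command signals failure through a nonzero exit status, which should be verified for zstash check before relying on it. If that does not hold, grepping the log for Exception remains the safer test.

```python
import subprocess
import time


def run_with_retries(command, attempts=3, delay_seconds=60):
    """Run command until it exits with status 0 or the attempts run out.

    Returns the 1-based number of the attempt that succeeded, or None if
    every attempt failed. Assumes a nonzero exit status means failure.
    """
    for attempt in range(1, attempts + 1):
        result = subprocess.run(command)
        if result.returncode == 0:
            return attempt
        # Wait before the next try in case hsi is only briefly unavailable.
        if attempt < attempts:
            time.sleep(delay_seconds)
    return None
```

The command is passed as a list, for example ["zstash", "check", "--hpss=<hpss_path>"], where <hpss_path> is a placeholder for the archive location. A None result means the failure is reproducible and probably not just an intermittent hsi problem.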
I have also run into the "This command includes hsi. Be sure that you have logged into hsi" error intermittently. I can connect to hsi. I'm wondering whether anyone has figured out a solution?
> I can connect to hsi. I'm wondering whether anyone has figured out a solution?
@wagmanbe #314 -- that error message offers one possible solution (probably the most common), but there may be other things wrong.
Thank you. It's helpful to know that there is not one obvious cause. I'll keep troubleshooting.
And yes, I can log into hsi manually, e.g. > hsi
I encountered the errors below for a few files when specifying the tar file names.
When I retried, they were checked successfully. What do these errors mean? Why do they disappear when trying again? Thanks.