Yes, that all sounds exactly right.
I would just add: make sure the files in
/lcrc/group/e3sm/public_html/diagnostics/observations/
are group readable and writable, in case someone needs to modify the files for some reason.
I will also note that the files are available for public download here: https://web.lcrc.anl.gov/public/e3sm/diagnostics/observations/ So it is very important that you have permission to distribute the observations and that you include any license information that is required along with them.
If you have observations you wish to use internally in E3SM but are not willing or able to distribute publicly, you can put them in:
/lcrc/group/e3sm/diagnostics_private/observations
These also get synced with mache, and again you should make them group readable and writable, but they will not be available on the web server.
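To make the permission advice concrete, here is a minimal Python sketch that adds group read and write permission to every regular file and directory under a tree. The function name `add_group_rw` is hypothetical and not part of mache; it assumes you own the files and skips symlinks so that nothing outside the tree is touched.

```python
import os
import stat


def add_group_rw(root):
    """Add group read/write (and group execute on directories) under root.

    Hypothetical helper, not part of mache. Returns the paths it changed.
    """
    changed = []
    for dirpath, dirnames, filenames in os.walk(root):
        # Each directory is visited once as dirpath; files come from filenames.
        candidates = [dirpath] + [os.path.join(dirpath, f) for f in filenames]
        for path in candidates:
            if os.path.islink(path):
                continue  # os.chmod would follow the link, so leave it alone
            mode = stat.S_IMODE(os.stat(path).st_mode)
            wanted = mode | stat.S_IRGRP | stat.S_IWGRP
            if os.path.isdir(path):
                wanted |= stat.S_IXGRP  # group members must be able to enter it
            if wanted != mode:
                os.chmod(path, wanted)
                changed.append(path)
    return changed
```

Calling `add_group_rw` on your own upload directory before announcing new data should be enough; a second call returns an empty list because nothing is left to change.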
@xylar, can mache handle symbolic links? We'd like to use the links to point to different versions of input files without changing the source code.
@tangq, I think it will copy the file twice rather than copying the symlink. So if I'm right about that, you will only save disk space by symlinking on Chrysalis and Anvil. On other machines, you will have redundant copies. But that's not a huge problem unless you're talking about very large files.
@hsiangheleellnl and I just discussed it. The main benefit of symlinks is that we can use the same input file names in the Python scripts, so we don't need to update the code when updating the input data.
Yep, that's fine. I will also check if symlinks get preserved by my rsync commands between machines in case what I said above is not correct.
@hsiangheleellnl will upload the input files (with time stamps in the file names) to the LCRC data server and create symlinks there.
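The plan of keeping time-stamped files behind a stable name could look roughly like the sketch below. The function `point_stable_name_at` and the file names in the usage note are hypothetical. The link target is just the file's base name, so the link stays relative and lives in the same directory as the data.

```python
import os


def point_stable_name_at(versioned_file, stable_name):
    """Create or replace a relative symlink named stable_name next to versioned_file.

    Hypothetical helper for illustration only. Returns the path of the link.
    """
    directory = os.path.dirname(os.path.abspath(versioned_file))
    link_path = os.path.join(directory, stable_name)
    target = os.path.basename(versioned_file)  # relative: same directory as the link

    # Build the new link under a temporary name, then rename it into place so
    # readers never see a moment where the stable name is missing.
    tmp_link = link_path + ".tmp"
    if os.path.lexists(tmp_link):
        os.remove(tmp_link)
    os.symlink(target, tmp_link)
    os.replace(tmp_link, link_path)
    return link_path
```

For example, `point_stable_name_at("ozone_20230101.nc", "ozone.nc")` would let the scripts keep opening `ozone.nc` while the data behind it is swapped for a newer time-stamped file.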
It sounds like the input data are rsynced by your script to E3SM machines. We can test that once @hsiangheleellnl has uploaded the files.
@tangq, I looked at the mache code and it seems like I'm using the --link flag for rsync, which should keep symlinks the same on all the machines as they are on LCRC. So make sure they're relative-path links within diagnostics and they should work elsewhere.
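Since rsync copies symlinks as symlinks when its `-l` / `--links` option is in effect, a link only keeps working on another machine if its target is relative and stays inside the synced tree. The sketch below is one way to check that before uploading; `find_nonportable_links` is a hypothetical name and the check is purely lexical (it does not test whether the target exists).

```python
import os


def find_nonportable_links(root):
    """Return (path, reason) pairs for symlinks that may break after a sync.

    Hypothetical checker: flags absolute targets and relative targets that
    point outside root.
    """
    root = os.path.realpath(root)
    problems = []
    for dirpath, dirnames, filenames in os.walk(root):
        # Symlinks to directories are listed in dirnames, others in filenames.
        for name in dirnames + filenames:
            path = os.path.join(dirpath, name)
            if not os.path.islink(path):
                continue
            target = os.readlink(path)
            if os.path.isabs(target):
                problems.append((path, "absolute target"))
                continue
            resolved = os.path.normpath(os.path.join(dirpath, target))
            if os.path.commonpath([root, resolved]) != root:
                problems.append((path, "target outside root"))
    return problems
```

An empty list means every link under the tree is relative and points somewhere inside it, which is the condition described above.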
@hsiangheleellnl uploaded the input files to the input data server at /lcrc/group/e3sm/diagnostics_private/observations/Atm/ChemDyg_inputs
@xylar I have a question about how to reset 'diagnostics_base_path'. The current setup is to indicate the path /lcrc/group/e3sm/diagnostics/observations/Atm, but we want to put the input data in the 'diagnostics_private'. How can I reset the link?
@hsiangheleellnl, that's a great question that has a bit of a complicated answer.
On LCRC there are 3 diagnostics directories:
/lcrc/group/e3sm/public_html/diagnostics/
/lcrc/group/e3sm/diagnostics_private/
/lcrc/group/e3sm/diagnostics/
The mache sync diags tool (part of E3SM-Unified, from the mache package) is used to copy the diagnostics data from the first 2 directories into the 3rd one. This seems strange on LCRC but it's the equivalent procedure locally to what happens on all the other E3SM supported machines: we combine the public and private data so they're all in one place.
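To illustrate the idea of combining the public and private trees into one directory, here is a rough Python sketch. It is not how mache does it (mache uses rsync, as discussed above); `merge_trees` is a hypothetical function that copies regular files and recreates symlinks as symlinks, with later sources overwriting earlier ones.

```python
import os
import shutil


def merge_trees(sources, destination):
    """Copy each source tree into destination, keeping symlinks as symlinks.

    Illustrative stand-in only; the real sync is done by mache with rsync.
    """
    for source in sources:
        for dirpath, dirnames, filenames in os.walk(source):
            rel = os.path.relpath(dirpath, source)
            out_dir = os.path.normpath(os.path.join(destination, rel))
            os.makedirs(out_dir, exist_ok=True)
            # Symlinked directories appear in dirnames and are not descended into.
            for name in dirnames + filenames:
                src = os.path.join(dirpath, name)
                dst = os.path.join(out_dir, name)
                if os.path.islink(src):
                    if os.path.lexists(dst):
                        os.remove(dst)  # raises if dst is a real directory
                    os.symlink(os.readlink(src), dst)
                elif os.path.isfile(src):
                    if os.path.islink(dst):
                        os.remove(dst)  # avoid writing through an old link
                    shutil.copy2(src, dst)
```

With the three LCRC directories listed above, the call would be `merge_trees([public_dir, private_dir], combined_dir)`, where the three variables hold those paths.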
I will go ahead and run the mache sync diags tool on LCRC and your data should end up in the expected place. Are there other machines where you need the data right now as well? If so I'll sync those. Otherwise, I would wait a bit because I have other work that will require syncing that is waiting in the wings.
@tangq and @hsiangheleellnl, the diagnostics that you placed in diagnostics_private should now be synced to diagnostics. Let me know if you have any trouble.
Thank you, @xylar, for the detailed reply. Now I have a better idea of the mache sync diags logic. I can see the data synced to diagnostics on LCRC. Can you run it for Compy, where we may run chemistry tests due to the limited Chrysalis scratch space?
I noticed that the diagnostics_private directory cannot be accessed from blues. I guess that's intentional.
I noticed that the diagnostics_private directory cannot be accessed from blues. I guess that's intentional.
I just logged onto blues and I was able to access it just fine:
$ pwd
/lcrc/group/e3sm/diagnostics_private
$ ls -lah
total 34K
drwxrws--- 3 ac.xasay-davis E3SM 4.0K Nov 15 04:00 .
drwxrwsr-x+ 200 root E3SM 16K Mar 23 10:50 ..
drwxrws--- 4 ac.xasay-davis E3SM 4.0K Mar 16 12:35 observations
Access is the same from Anvil (Blues) and Chrysalis as far as I know. Could you check again?
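When access looks different from what you expect, it can help to read the mode bits programmatically. This is a small sketch using only Python's standard `stat` module; `summarize_mode` is a hypothetical helper, and it ignores ACLs (the `+` in the listing above) and group membership, both of which also affect access.

```python
import os
import stat


def summarize_mode(mode):
    """Summarize the group-related permission bits of a st_mode value.

    Hypothetical helper; ACLs and group membership are not considered.
    """
    return {
        "listing": stat.filemode(mode),  # same notation as `ls -l`
        "group_can_read": bool(mode & stat.S_IRGRP),
        "group_can_write": bool(mode & stat.S_IWGRP),
        "setgid": bool(mode & stat.S_ISGID),
        "others_have_access": bool(mode & stat.S_IRWXO),
    }
```

For a real directory, pass `os.stat(path).st_mode`. A mode of `stat.S_IFDIR | 0o2770` gives the listing `drwxrws---`, matching the `diagnostics_private` entry above: group members can read and write, new files inherit the group, and everyone else is shut out.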
Can you run it for compy, where we may run chemistry tests due to the limited chrysalis scratch space?
Sure, I synced to Compy.
I can access diagnostics_private from Blues now.
If you need more scratch space on Chrysalis, use /gpfs/fs0/globalscratch
This was discussed at the infrastructure group meeting and documented at this meeting notes page.
The suggestions are:
/lcrc/group/acme/public_html/diagnostics/observations/
mache
to all machines.