Closed AndyHoggANU closed 2 years ago
PS. This is using the old mppnccombine, because we think mppnccombine-fast doesn't cope well with the regional outputs: it misses the coordinates of masked tiles.
I thought the issue was that the output of mppnccombine-fast would have one chunk per core. But maybe we could re-chunk it to make it usable.
I'd suggest changing that payu code to something like

    tile_fnames = [f for f in glob(os.path.join(dir, '*.nc.*'))
                   if f.split('.')[-1].isdigit() and f.split('.')[-2] == 'nc']

This would also need `from glob import glob` (and `import os`, if not already imported).
There are sorting issues as well once the zero-padding runs out. I decided to use pathlib because it is cleaner, and created some tests to check it is doing the right thing too.
Clearly I need to read more carefully. Those suffixes are zero-padded.
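Even with zero-padded suffixes, a numeric sort sidesteps any padding assumptions entirely. A minimal sketch (`sort_tiles` is a name of my own, not payu's):

```python
from pathlib import Path

def sort_tiles(paths):
    """Sort tile filenames by the integer value of their trailing
    '.NNNN' suffix, so the order is correct whether or not the
    suffixes are zero-padded."""
    return sorted(paths, key=lambda p: int(Path(p).suffix[1:]))
```

A lexicographic sort would put `a.nc.10` before `a.nc.2`; the integer key restores the intended order.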
I have pushed a new tag (1.0.22), which should show up in conda/analysis3-unstable in 30-40 minutes, all things being equal.
ping @AndyHoggANU
Ping yourself ... I gave this a try. Am using conda/analysis3-unstable, but I find:
laboratory path: /scratch/v45/amh157/access-om2
binary path: /scratch/v45/amh157/access-om2/bin
input path: /scratch/v45/amh157/access-om2/input
work path: /scratch/v45/amh157/access-om2/work
archive path: /scratch/v45/amh157/access-om2/archive
{'/scratch/v45/amh157/access-om2/archive/01deg_jra55v140_iaf_cycle3_HF/output704/ocean': []}
27763778.gadi-pbs
payu: Found modules in /opt/Modules/v4.3.0
======================================================================================
Resource Usage on 2021-09-03 19:01:39:
Job Id: 27763771.gadi-pbs
Project: x77
Exit Status: 0
Service Units: 0.08
NCPUs Requested: 4 NCPUs Used: 4
CPU Time Used: 00:00:02
Memory Requested: 192.0GB Memory Used: 159.64MB
Walltime requested: 10:00:00 Walltime Used: 00:00:03
JobFS requested: 100.0MB JobFS used: 0B
======================================================================================
Maybe it just hasn't gone through yet?
Yeah, the conda update errored; it's still the previous version.
$ conda list payu
# packages in environment at /g/data3/hh5/public/apps/miniconda3/envs/analysis3-21.07:
#
# Name      Version   Build   Channel
payu        1.0.21    py_0    coecms
Sorry, the conda install is broken. I tried a quick fix, but it didn't work. Will have to wait until Monday, I'm afraid.
Or you can load the conda/python3 environment then try `pip install --user` directly from GitHub, or clone payu and `pip install . --user`, and then use `~/.local/bin/payu`.
@AndyHoggANU Fixed the conda update, give it a crack
Didn't realise it was Monday already. ;-)
Anyway, I tried this -- conda has indeed updated but I get this error:
[amh157@gadi-login-02 01deg_jra55v140_iaf_cycle3_HF]$ more 01deg_jra55_i_c.e27781306
Traceback (most recent call last):
  File "/g/data3/hh5/public/apps/miniconda3/envs/analysis3-21.07/bin/payu-collate", line 10, in <module>
    sys.exit(runscript())
  File "/g/data/hh5/public/apps/miniconda3/envs/analysis3-21.07/lib/python3.9/site-packages/payu/subcommands/collate_cmd.py", line 111, in runscript
    expt.collate()
  File "/g/data/hh5/public/apps/miniconda3/envs/analysis3-21.07/lib/python3.9/site-packages/payu/experiment.py", line 814, in collate
    model.collate()
  File "/g/data/hh5/public/apps/miniconda3/envs/analysis3-21.07/lib/python3.9/site-packages/payu/models/fms.py", line 143, in collate
    fnames = Fms.get_uncollated_files(self.output_path)
  File "/g/data/hh5/public/apps/miniconda3/envs/analysis3-21.07/lib/python3.9/site-packages/payu/models/fms.py", line 66, in get_uncollated_files
    tile_fnames = [f for f in Path(dir).iterdir()
  File "/g/data/hh5/public/apps/miniconda3/envs/analysis3-21.07/lib/python3.9/site-packages/payu/models/fms.py", line 68, in <listcomp>
    f.suffixes[1][1:].isdigit()]
IndexError: list index out of range
Not sure which list index it is referring to.
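For context: `Path.suffixes` on a name with fewer than two dots (say a stray `ocean.nc` or an extensionless file in the output directory) returns a list with fewer than two elements, so `suffixes[1]` raises exactly this IndexError. A length guard avoids it; this is a sketch of the idea, not payu's actual fix:

```python
from pathlib import Path

def is_uncollated_tile(fname):
    """True for tile names like 'ocean.nc.010431'.

    Such names have at least two suffixes (['.nc', '.010431']);
    anything shorter (e.g. 'ocean.nc' gives ['.nc']) would make
    suffixes[1] raise IndexError, hence the length check first.
    """
    s = Path(fname).suffixes
    return len(s) >= 2 and s[-2] == '.nc' and s[-1][1:].isdigit()
```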
@AndyHoggANU can you try again
Yep, trying now -- will keep you posted.
BTW, it appears to be working ... but very slowly. I think this is characteristic of combining regional diagnostics, so I will just let it run its course.
It may be faster to use mppnccombine-fast with an option that forces it to recompress the data, which overcomes the chunking issue, e.g. `-d 4`.
I have an ACCESS-OM2-01 simulation where I am trying to save some regional diagnostics. The simulation uses Andrew's 10461 core count for MOM, meaning that the regional diagnostics routine (which writes out one netCDF tile per core) now has 6 digits in the filename after the .nc, like:
rregionocean-2d30m-vorticity_z-3-hourly-mean-ym_2012_01.nc.010431
It seems that payu doesn't ask mppnccombine to collate these files, likely because of this: https://github.com/payu-org/payu/blob/9348acdf92ca18aae229fc06b0b716d4cd85e1aa/payu/models/fms.py#L65-L66
Is there a nice way to generalise this bit of code?
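One width-agnostic option would be to match the tile suffix with a regex that accepts any run of digits after `.nc.` at the end of the name. A sketch (`TILE_RE` and `looks_like_tile` are names I've made up, not payu's):

```python
import re

# Accept any number of digits after '.nc.' at the end of the name,
# so 4-, 5- and 6-digit tile suffixes (e.g. '.nc.010431') all match.
TILE_RE = re.compile(r'\.nc\.\d+$')

def looks_like_tile(fname):
    return bool(TILE_RE.search(fname))
```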