Closed adagj closed 1 year ago
There are still some unclear errors for some of these realisations; I cite this one as an example:
For this one, 1850-1869 data seems OK, but then there are some errors:
cmorized number of files
variant: r6i1p1f1
Ofx, fx, etc 10
yyyy1 yyyy2 nf
1850 1859 486
1860 1869 486
1870 1879 46
1880 1889 46
1890 1899 46
1900 1909 46
...
2010 2014 46
Total: 1672
Total r6i1p1f1: 1672
So from 1870 onwards, only 46 files can be produced per period.
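A tally like the one above can be reproduced by grouping the CMORized files by the time range in their names. The following is a minimal sketch, assuming the files follow the standard CMIP6 naming pattern ending in `_<start>-<end>.nc`. The function names and the example file names are hypothetical, and files without a time range (Ofx, fx, etc.) are counted under `None`.

```python
import re
from collections import Counter

# Time range at the end of a CMIP6-style file name, e.g. "_187001-187912.nc".
# Only the leading four digits (the year) of each bound are captured.
TIME_RANGE = re.compile(r"_(\d{4})\d*-(\d{4})\d*\.nc$")


def count_by_period(filenames):
    """Count files per (start_year, end_year); time-invariant files go under None."""
    counts = Counter()
    for name in filenames:
        match = TIME_RANGE.search(name)
        key = (int(match.group(1)), int(match.group(2))) if match else None
        counts[key] += 1
    return counts


def short_periods(counts, expected):
    """Return the periods whose file count is below the expected number."""
    return sorted(p for p, n in counts.items() if p is not None and n < expected)


# Hypothetical file names, for illustration only.
example = [
    "tas_Amon_NorESM2-LM_esm-hist_r6i1p1f1_gn_185001-185912.nc",
    "pr_Amon_NorESM2-LM_esm-hist_r6i1p1f1_gn_185001-185912.nc",
    "tas_Amon_NorESM2-LM_esm-hist_r6i1p1f1_gn_187001-187912.nc",
    "areacello_Ofx_NorESM2-LM_esm-hist_r6i1p1f1_gn.nc",
]
counts = count_by_period(example)
print(counts)
print(short_periods(counts, expected=2))
```

As a cross-check of the table, and assuming the elided rows continue in decades with 46 files each, 2 × 486 + 15 × 46 + 10 = 1672, which matches the reported total.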
During CMORization, errors like the following appear:
--------------------------------------------------------------------------
Open MPI failed an OFI Libfabric library call (fi_endpoint). This is highly
unusual; your job may behave unpredictably (and/or abort) after this.
Local host: ipcc
Location: mtl_ofi_component.c:513
Error: Invalid argument (22)
--------------------------------------------------------------------------
ipcc:rank0: PSM3 can't open nic unit: 0 (err=23)
ipcc:rank0: PSM3 can't open nic unit: 0 (err=23)
ipcc:rank0: PSM3 can't open nic unit: 0 (err=23)
…
ipcc:rank7: PSM3 can't open nic unit: 0 (err=23)
ipcc:rank6: PSM3 can't open nic unit: 0 (err=23)
[ipcc:2062563] 7 more processes have sent help message help-mtl-ofi.txt / OFI call fail
[ipcc:2062563] Set MCA parameter "orte_base_help_aggregate" to 0 to see all help / error messages
The CMORization tool is identical and generally runs well for other realisations, so I suspect an error in the files or the file system.
Do you have any ideas?
I can ask Sigma2 whether they can quickly spot an error and give us a clue.
I'll have a look and see if I find something
Hm, I can't find any problems with the files. I can open and display all of them, and I'm not sure what else to check... Maybe you can ask Sigma2?
I'm not sure if this is relevant, but a similar problem was discussed here: https://github.com/ofiwg/libfabric/issues/6710 Sorry if it is something completely different...
OK, I can ask Sigma2 whether they have any clue. Otherwise, I will dig further into it.
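If the failure comes from the OFI/PSM3 transport rather than from the input files, one thing that might be worth trying is to keep Open MPI from selecting the OFI MTL. This is only a sketch under that assumption and not a confirmed fix: it relies on Open MPI's `OMPI_MCA_<parameter>` environment variables and the `^` exclusion syntax, and the helper name and the commented-out launch command are hypothetical.

```python
import os


def env_without_ofi_mtl(base_env=None):
    """Return a copy of the environment that asks Open MPI to skip the OFI MTL."""
    env = dict(os.environ if base_env is None else base_env)
    # Equivalent to passing "--mca mtl ^ofi" on the mpirun command line.
    env["OMPI_MCA_mtl"] = "^ofi"
    return env


env = env_without_ofi_mtl()
print(env["OMPI_MCA_mtl"])

# Hypothetical launch; replace the command with the actual CMORization call:
# import subprocess
# subprocess.run(["mpirun", "-n", "8", "./cmorize_command"], env=env, check=True)
```

If the run then completes for 1870 onwards, that would point to the MPI transport layer rather than to the files.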
@YanchunHe Good if you can prioritize the SSP simulations. Are those also problematic?
Yes, OK. Most of the esm-hist simulations are done, so I will soon move on to the SSP simulations.
Great. Thanks!
CMORized, and ready to be published to ESGF.
486 datasets for every 10 years.
data path
version
sha256sum /projects/NS9034K/CMIP6/CMIP/NCC/NorESM2-LM/esm-hist
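For reference, checksums in the same layout as the `sha256sum` command-line tool can also be produced from Python. This is a minimal sketch, assuming the published files are NetCDF files below a single directory. The function names are hypothetical, and the path in the usage comment is the one quoted above.

```python
import hashlib
from pathlib import Path


def sha256_of_file(path, chunk_size=1024 * 1024):
    """Hash a file in chunks so that large NetCDF files are not read into memory at once."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def checksum_lines(root, pattern="*.nc"):
    """Return '<hash>  <path>' lines, as sha256sum prints them, for all matching files."""
    files = sorted(p for p in Path(root).rglob(pattern) if p.is_file())
    return [f"{sha256_of_file(p)}  {p}" for p in files]


# Usage (path taken from the comment above):
# for line in checksum_lines("/projects/NS9034K/CMIP6/CMIP/NCC/NorESM2-LM/esm-hist"):
#     print(line)
```

Reading in fixed-size chunks keeps memory use constant regardless of file size.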
@YanchunHe published
Mandatory information:
Full path to the case(s) of the experiment on NIRD /projects/projects/NS10013K/noresm/cases/
experiment_id esm-hist
model_id NorESM2-LM
CASENAME(s) and years to be CMORized NHIST_2201_f19_tn14_20230201esm, 1850-2014
Optional information
6th realization, r6i1p1f1
parent_experiment_id esm-piControl
parent_experiment_rip r1i1p1f1
parent_time_units "days since 1851-01-01"
branch_method 'Hybrid-restart from year 2201-01-01 of esm-piControl'
other information (provide other information that might be useful) 6th realization, hybrid restart from /projects/NS9560K/noresm/cases/N1850_f19_tn14_20190730esm at year 2201-01-01