NorESMhub / noresm2cmor

A command line tool for cmorizing NorESM output
http://noresmhub.github.io/noresm2cmor/

[CMIP6 CMOR-ization & ESGF-publication] NorESM2-LM - esm-hist #331

Closed adagj closed 1 year ago

adagj commented 1 year ago

Mandatory information:

Full path to the case(s) of the experiment on NIRD /projects/projects/NS10013K/noresm/cases/

experiment_id esm-hist

model_id NorESM2-LM

CASENAME(s) and years to be CMORized NHIST_2201_f19_tn14_20230201esm, 1850-2014

Optional information

6th realization, r6i1p1f1,

parent_experiment_id esm-piControl

parent_experiment_rip r1i1p1f1

parent_time_units "days since 1851-01-01"

branch_method 'Hybrid-restart from year 2201-01-01 of esm-piControl',

other information (provide other information that might be useful) 6th realization, hybrid restart from /projects/NS9560K/noresm/cases/N1850_f19_tn14_20190730esm at year 2201-01-01

YanchunHe commented 1 year ago

Some of these realisations still produce errors whose cause is unclear. Taking this one as an example: the 1850-1869 data seems OK, but errors appear after that:

cmorized number of files

variant: r6i1p1f1
Ofx, fx, etc    10
yyyy1   yyyy2   nf
1850    1859    486
1860    1869    486
1870    1879    46
1880    1889    46
1890    1899    46
1900    1909    46
...
2010    2014    46
Total:      1672
Total r6i1p1f1: 1672

So from 1870 onwards, only 46 files are produced per period.
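The shortfall can be flagged mechanically. Below is a minimal sketch, assuming the per-period counts are available as plain text in the layout shown above, and assuming 486 files is the expected count per period (the value seen for 1850-1869). `parse_counts` and `find_short_periods` are hypothetical helpers, not part of noresm2cmor, and only the rows actually listed above are included.

```python
# Assumption: 486 is the expected file count per period (as in 1850-1869).
EXPECTED = 486

# Only the rows listed in the summary above; elided rows are not filled in.
TABLE = """\
1850    1859    486
1860    1869    486
1870    1879    46
1880    1889    46
1890    1899    46
1900    1909    46
2010    2014    46
"""


def parse_counts(text):
    """Return (first_year, last_year, n_files) tuples from whitespace-separated rows."""
    rows = []
    for line in text.splitlines():
        parts = line.split()
        # Skip headers, blank lines and anything that is not three integers.
        if len(parts) == 3 and all(p.isdigit() for p in parts):
            rows.append(tuple(int(p) for p in parts))
    return rows


def find_short_periods(rows, expected=EXPECTED):
    """Return the periods whose file count is below the expected value."""
    return [(y1, y2, nf) for (y1, y2, nf) in rows if nf < expected]


short = find_short_periods(parse_counts(TABLE))
for y1, y2, nf in short:
    print(f"{y1}-{y2}: {nf} of {EXPECTED} files")
```

Running the same check after a re-run would show directly which periods still need attention.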

During CMORization, errors like the following appear:

--------------------------------------------------------------------------
Open MPI failed an OFI Libfabric library call (fi_endpoint).  This is highly
unusual; your job may behave unpredictably (and/or abort) after this.

  Local host: ipcc
  Location: mtl_ofi_component.c:513
  Error: Invalid argument (22)
--------------------------------------------------------------------------
ipcc:rank0: PSM3 can't open nic unit: 0 (err=23)
ipcc:rank0: PSM3 can't open nic unit: 0 (err=23)
ipcc:rank0: PSM3 can't open nic unit: 0 (err=23)
…
ipcc:rank7: PSM3 can't open nic unit: 0 (err=23)
ipcc:rank6: PSM3 can't open nic unit: 0 (err=23)
[ipcc:2062563] 7 more processes have sent help message help-mtl-ofi.txt / OFI call fail
[ipcc:2062563] Set MCA parameter "orte_base_help_aggregate" to 0 to see all help / error messages
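The last log line suggests setting the MCA parameter `orte_base_help_aggregate` to 0 to see every help/error message. A minimal sketch of one way to do that from a Python wrapper, assuming Open MPI's convention of reading MCA parameters from `OMPI_MCA_<name>` environment variables. `mca_env` is a hypothetical helper, and this only makes the messages more verbose; it is not a fix for the OFI/PSM3 error itself.

```python
import os


def mca_env(params, base=None):
    """Return a copy of the environment with Open MPI MCA parameters set.

    Assumption: Open MPI picks up MCA parameters from variables named
    OMPI_MCA_<parameter name>.
    """
    env = dict(os.environ if base is None else base)
    for name, value in params.items():
        env[f"OMPI_MCA_{name}"] = str(value)
    return env


# Build the environment on an empty base so the result is easy to inspect.
env = mca_env({"orte_base_help_aggregate": 0}, base={})
print(env)
```

The resulting dictionary can be passed as `env=` to `subprocess.run` when launching the job; the actual launch command is site-specific and not shown here.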

The cmorization tool is identical and runs fine for some other realisations, so I suspect an error in the files or the file system.

Do you have any ideas?

I can ask Sigma2 whether they can quickly spot an error or give us a clue.

adagj commented 1 year ago

I'll have a look and see if I find something

adagj commented 1 year ago

Hm, I can't find any problems with the files. I can open and display all of them, and I'm not sure what else to check. Maybe you can ask Sigma2?

I'm not sure if this is relevant, but a similar problem was discussed here: https://github.com/ofiwg/libfabric/issues/6710. Sorry if it is something completely different.

YanchunHe commented 1 year ago

OK, I can ask Sigma2 whether they have any clue. Otherwise, I will dig further into it.

adagj commented 1 year ago

@YanchunHe Good if you can prioritize the SSP simulations. Are those also problematic?

YanchunHe commented 1 year ago

Yes, OK. Most of the esm-hist simulations are done, so I will move on to the SSP simulations soon.

adagj commented 1 year ago

Great. Thanks!

YanchunHe commented 1 year ago

CMORized, and ready to be published to ESGF.

486 datasets for every 10 years.

data path

version

sha256sum /projects/NS9034K/CMIP6/CMIP/NCC/NorESM2-LM/esm-hist
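For anyone who wants to verify the published files against the checksums, here is a minimal sketch that computes SHA-256 digests for every file under a directory and prints them in the `<digest>  <path>` layout used by `sha256sum`. `sha256_of` and `checksum_tree` are hypothetical helpers, and the directory to scan is left to the caller.

```python
import hashlib
from pathlib import Path


def sha256_of(path, chunk_size=1024 * 1024):
    """Return the hex SHA-256 digest of a file, read in chunks to limit memory use."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def checksum_tree(root):
    """Yield (digest, path) for every regular file below root, in sorted order."""
    for path in sorted(Path(root).rglob("*")):
        if path.is_file():
            yield sha256_of(path), path


def print_checksums(root):
    """Print one line per file: digest, two spaces, path."""
    for digest, path in checksum_tree(root):
        print(f"{digest}  {path}")
```

Calling `print_checksums` on the dataset directory and comparing the output with the stored checksum list shows any file that differs.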

monsieuralok commented 1 year ago

@YanchunHe published