Closed treerink closed 3 years ago
As a start @goord will investigate in which way such an interface can be created. So in a branch we will just try to create this kind of infrastructure and as a test we take a little script which just does the same cdo as which is used currently.
Hi, as noted in the other script, the external script computes:
The variables in table 126 (from 120 to 132) are tendencies, and as such they should be in ACCUMFLD in grib_codes.json (I guess)
Another point is that in cdoapi.py I don't see: zonmean, enlarge, mul, sub, griddes (with txt in output).
nco
is not available in the ece2cmor3 environment right?
@fserva we have been looking into one of the examples: EmonZ
utendepfd
and as you can see the data request asks this variable on 39 pressure levels (plev39).
The plev39
coordinate from the table looks like:
"plev39": {
"standard_name": "air_pressure",
"units": "Pa",
"axis": "Z",
"long_name": "pressure",
"climatology": "",
"formula": "",
"must_have_bounds": "no",
"out_name": "plev",
"positive": "down",
"requested": [
"100000.",
"92500.",
"85000.",
"70000.",
"60000.",
"50000.",
"40000.",
"30000.",
"25000.",
"20000.",
"17000.",
"15000.",
"13000.",
"11500.",
"10000.",
"9000.",
"8000.",
"7000.",
"5000.",
"3000.",
"2000.",
"1500.",
"1000.",
"700.",
"500.",
"300.",
"200.",
"150.",
"100.",
"70.",
"50.",
"40.",
"30.",
"20.",
"15.",
"10.",
"7.",
"5.",
"3."
],
"requested_bounds": "",
"stored_direction": "decreasing",
"tolerance": "",
"type": "double",
"valid_max": "",
"valid_min": "",
"value": "",
"z_bounds_factors": "",
"z_factors": "",
"bounds_values": "",
"generic_level_name": ""
},
So what is actually the reason that you request this variable on model levels and not directly on the requested pressure levels? If I would ask this variable to a DynvarMIP request and with that run our genecec
tool, that is what would happen, the (missing) plev39
levels will be added to the ppt files.
Hi, the 8 variables mentioned here are not available as model output. They need to be computed offline, but this requires computing vertical derivatives. Since increased vertical resolution reduces the numerical error, all 91 levels are used for the computation. Then subsampling of these variables needs to be done. I understand genecec
would not be needed in this case right?
The variables defined in table 126 are instead requested on the reduced set of pressure levels you are referring to.
Hello, I just noticed something with the units of the wind tendencies that should be computed offline.
In https://github.com/PCMDI/cmip6-cmor-tables/blob/master/Tables/CMIP6_EdayZ.json, for example utendvtem, it reports:
"utendvtem": {
"frequency": "day",
"modeling_realm": "atmos",
"standard_name": "tendency_of_eastward_wind_due_to_advection_by_northward_transformed_eulerian_mean_air_velocity",
"units": "m s-1 d-1",
"cell_methods": "longitude: mean time: mean",
"cell_measures": "",
"long_name": "Tendency of Eastward Wind Due to TEM Northward Advection and Coriolis Term",
"comment": "Tendency of zonally averaged eastward wind, by the residual northward wind advection (on the native model grid). Reference: Andrews et al (1987): Middle Atmospheric Dynamics. Academic Press.",
"dimensions": "latitude plev39 time",
"out_name": "utendvtem",
"type": "real",
"positive": "",
"valid_min": "",
"valid_max": "",
"ok_min_mean_abs": "",
"ok_max_mean_abs": ""
},
and same here http://clipc-services.ceda.ac.uk/dreq/01.00.21/u/DynVar.html.
However, the reference paper for this MIP asks for units m s-2, as in most cases I suppose.
The difference is the factor 86400. I've checked a couple of models and they have units of m/s/d.
What should have the precedence for this? @treerink @goord
I have merged the master into the dynvar-tem-script
branch and had to resolve two conflicts: one in the ifspar.json
and one in the the taskloader.py
.
@goord can you review the merged code below in the taskloader.py
:
model_vars = load_model_vars()
load_masks(model_vars)
load_scripts(model_vars)
masks = load_masks(load_model_vars())
return create_tasks(filtered_matches, get_models(active_components), masks=masks)
I wondered whether the recently introduced masks (for the automatic masked regridding for masked fields) needs any further connection with the load scripts. I actually thought this is disconnected as this part is fully covered by the called script in that case, but the added 'load_masks(model_vars)' in this branch did confuse me a bit. So asking for a check.
Hello, I just noticed something with the units of the wind tendencies that should be computed offline.
In https://github.com/PCMDI/cmip6-cmor-tables/blob/master/Tables/CMIP6_EdayZ.json, for example utendvtem, it reports:
"utendvtem": { "frequency": "day", "modeling_realm": "atmos", "standard_name": "tendency_of_eastward_wind_due_to_advection_by_northward_transformed_eulerian_mean_air_velocity", "units": "m s-1 d-1", "cell_methods": "longitude: mean time: mean", "cell_measures": "", "long_name": "Tendency of Eastward Wind Due to TEM Northward Advection and Coriolis Term", "comment": "Tendency of zonally averaged eastward wind, by the residual northward wind advection (on the native model grid). Reference: Andrews et al (1987): Middle Atmospheric Dynamics. Academic Press.", "dimensions": "latitude plev39 time", "out_name": "utendvtem", "type": "real", "positive": "", "valid_min": "", "valid_max": "", "ok_min_mean_abs": "", "ok_max_mean_abs": "" },
and same here http://clipc-services.ceda.ac.uk/dreq/01.00.21/u/DynVar.html.
However, the reference paper for this MIP asks for units m s-2, as in most cases I suppose.
The difference is the factor 86400. I've checked a couple of models and they have units of m/s/d.
What should have the precedence for this? @treerink @goord
@fserva, your script will have to output the velocity tendencies in m/s/d, since that is the unit ec2cmor3 will attach to the variable
I have merged the master into the
dynvar-tem-script
branch and had to resolve two conflicts: one in theifspar.json
and one in the thetaskloader.py
.@goord can you review the merged code below in the
taskloader.py
:model_vars = load_model_vars() load_masks(model_vars) load_scripts(model_vars) masks = load_masks(load_model_vars()) return create_tasks(filtered_matches, get_models(active_components), masks=masks)
I wondered whether the recently introduced masks (for the automatic masked regridding for masked fields) needs any further connection with the load scripts. I actually thought this is disconnected as this part is fully covered by the called script in that case, but the added 'load_masks(model_vars)' in this branch did confuse me a bit. So asking for a check.
Fixed the merged changes in rev. 97f2cf6bb7c6544de5b6c60da7b54e5c0414293a
I will change the units then. However this looks strange since for example utendepfd
is m s-2
while only these two variables are m s-1 d-1
, which would complicate combining them. Thanks for the tip
Hi @fserva I have committed a dummy script make_dynvars.sh
in the scripts directory of the dynvar-tem-script
branch. This script is called from ece2cmor3 because it has been referenced in the file ifspar.json
:
{
"source": "215.129",
"target": "epfy",
"post-proc": "make_dynvars"
},
...
{
"script": "make_dynvars",
"src": "make_dynvars.sh",
"filter": "False"
}
You can test your script by changing the src
field above by the name of your wrapper script. To run, make sure your script is in your PATH
. Then you can add all the variables produced by the script to ifspar.json
in the same way I defined epfy
above, with the same script field. Now ece2cmor3
will call your script whenever these variables are in a data request.
Thanks a lot, I will check this. I imagine that other variables should be 'sourced' sequentially, like 216.129 ... up to 220.129 or something like that. In case I will make a PR if I manage
right, you need to 'invent' unused grib codes for these new derived variables. It doesn't really matter which ones, as long as they are unique
Hello, I am testing this and I have two questions @goord.
First, as far as I understand 'external' scripts are to be found in ece2cmor3/scripts/. Can they also be nested there, i.e. can I have a folder (of a certain repository) and the scripts inside that? Secondly, it seems I cannot push a remote branch on ece2cmor3 (permission denied). Is there another way I could share my changes? Thanks a lot
First, as far as I understand 'external' scripts are to be found in ece2cmor3/scripts/. Can they also be nested there, i.e. can I have a folder (of a certain repository) and the scripts inside that?
Yes you can make a dynvar-scripts dir in the scripts dir in your branch
Secondly, it seems I cannot push a remote branch on ece2cmor3 (permission denied). Is there another way I could share my changes? Thanks a lot
I will have a look what your permissions are.
I have sent you a member invitation, you probably first have to confirm by clicking some identification link I guess.
Hello, I am testing this and I have two questions @goord.
First, as far as I understand 'external' scripts are to be found in ece2cmor3/scripts/. Can they also be nested there, i.e. can I have a folder (of a certain repository) and the scripts inside that? Secondly, it seems I cannot push a remote branch on ece2cmor3 (permission denied). Is there another way I could share my changes? Thanks a lot
Hi @fserva it's not necessary to add you script to the repo, if you can add it to your PATH
before launching ece2cmor3
, than it should be found automatically.
Right, you already told me about this! Thanks for the reminder.
We should think about documenting this appropriately for others BTW
Absolutely, I am taking a note in my script but perhaps this could be reported in the README file also here
It seems nco
was not added yet in the environment. Should this be done?
Also scipy
is necessary in this case (I cannot remove it entirely)
To avoid breaking things perhaps the environment file could be changed only in this branch...
Hello, I have implemented the changes in my local branch https://github.com/EC-Earth/ece2cmor3/commit/1652c48b3b38fd95d650d3af0bea9ed77dbf4872.
The external scripts work correctly after adding the PATH
and PYTHONPATH
in the platform. I need now to understand how it can be called through ece2cmor3, and how to pass arguments (exp name, directories, variable names, CMOR tables and so on)
Another update: the variables in the model output files appear to be correctly cmorized.
For those to be produced offline, I am having errors like ERROR:ece2cmor3.taskloader: Variable psitem in table EdayZ is not supported by ifs in ece2cmor3; if you do expect an ec-earth output variable here, please create an issue or pull request on our github page
.
I guess I should edit the list-of-identified-missing-cmpi6-requested-variables.xlsx
or the pextra
version to have it working? (b.t.w., it is ok that it reports cmpi6
instaed of cmip6
?
I notice that the variables are listed in list-of-ignored-cmpi6-requested-variables.xlsx
so maybe these are filtered; on the other hand, utendnogw
is in the same excel but it is cmorized correctly.
Hi @fserva, yes your variables should be removed from the ignored/identified missing excel files, these are checked by ece2cmor.
Merry Christmas
Thanks @goord I will check this. I was confused by the fact that some variables were processed despite being in the xlsx.
Happy holidays to you!
While testing (after doing a setup.py install
) I am having these kind of errors:
2020-12-26 17:52:32 ERROR:ece2cmor3.ifs2cmor: Could not find post-processing script None in ifspar.json, dismissing variable tntc in table EmonZ
( same for a lot of variables)2020-12-26 17:52:37 ERROR:ece2cmor3.ifs2cmor: Error in calling script /perm/ms/it/cc1f/ecearth3/post/tem- diag/make_dynvars.sh: None
The second one might be related on how the script is called, however the first one is more puzzling since it is thrown for a variable that was correctly processed before, and it does not require a script.
I restored the old excel files, so they should not be the problem (it is strange that tntrlcs_EmonZ.nc
is processed while being at line 243 of the ignored
excel). ThanksHi @fserva, best wishes to you. I pushed a fix of the typo causing the first error message. I also improved the error handling, so I hope the message from your second error is more useful
Hello @goord happy new year to you! I have commented your commit, it seems the problem is with the external script not getting the required parameters. I also think the wrong (GG
) file is passed instead of the SH
one.
Hello @goord and @treerink, an update on the integration of the external script. Now the execution works well and a set of files are produced. But there is a problem related I think with the timing of the execution.
The production of derived variables requires ~50min/month, while the cmorization of basic variables is faster (total ~3-4 hours). It seems that ece2cmor did not wait for the external script to end, so I only have the first 6 months processed and the cmorization failed for these variables (the output folder is not removed).
Is there a way to have ece2cmor waiting for the external process? Thanks
PS this is how the log looks like, the cdo commands are halted and execution stops
cdo merge: Processed 1462763520 values from 3 variables over 360 timesteps [28.40s 125MB]
#[18:26:39] make_dynvars.sh[118]> cdo selvar,var130 -ml2plx,101205,100915,100512,99984,99330,98563,97667,96617,9
5409,94044,92522,90844,89013,87038,84925,82681,80316,77840,75267,72607,69870,67073,64229,61350,58449,55540,52646
,49796,47017,44334,41773,39344,37042,34862,32799,30848,29002,27256,25607,24048,22577,21187,19876,18638,17472,163
72,15335,14359,13440,12575,11759,10989,10258,9562,8896,8257,7643,7051,6480,5931,5407,4907,4433,3985,3563,3167,27
99,2457,2141,1852,1588,1350,1137,947,780,634,509,402,312,238,178,130,92,64,43,28,17,10,6,3,1 ./t4cv_t_ml_185006.
nc_1 ./t4cv_t_pl_185006.nc
cdo(2) ml2plx: Process started
2021-01-06 18:27:06 ERROR:ece2cmor3.ifs2cmor: Output file /scratch/ms/it/cc1f/tmp_cmor/t4cv_1850_10290/ifs_1850/
CMIP6/t4cv-ifs-1850/make_dynvars-work/utendvtem_EdayZ.nc of script /perm/ms/it/cc1f/ecearth3/post/tem-diag/make_
dynvars.sh could not be found... skipping cmorization of task
2021-01-06 18:27:06 ERROR:ece2cmor3.ifs2cmor: Output file /scratch/ms/it/cc1f/tmp_cmor/t4cv_1850_10290/ifs_1850/
CMIP6/t4cv-ifs-1850/make_dynvars-work/utendwtem_EdayZ.nc of script /perm/ms/it/cc1f/ecearth3/post/tem-diag/make_
dynvars.sh could not be found... skipping cmorization of task
...other variables skipped...
2021-01-06 18:27:06 WARNING:ece2cmor3.ifs2cmor: Skipped removal of nonempty work directory /scratch/ms/it/cc1f/t
mp_cmor/t4cv_1850_10290/ifs_1850/CMIP6/t4cv-ifs-1850
! ------
! All files were closed successfully.
! ------
Hi @fserva you mean the working directory is removed before your script has finished? I will have a look at it in the evening. In the meantime, you can disable the cleanup by defining the environment variable export ECE2CMOR3_IFS_CLEANUP=FALSE
before you launch ece2cmor.
Hi @goord no the working directory is left untouched since non-empty (I've pasted the log now), but ece2cmor finishes the execution without waiting for the slow processing still ongoing. I don't think the lack of SH
files is the problem since the data
folder and links are still there.
I guess ece2cmor stops the child process when other variables are done
Hi @fserva I had a quick look at the code and I think somehow your script has failed after a few months of processing. After your script is finished, ece2cmor3
will proceed to cmorize the nc
files you have created, but it can't find those. However, ec2cmor3
will always wait for your script to return because it invokes it via subprocess.check_output
(see line 306 of ifs2cmor.py
), which is a synchronous function call.
Do you see the file /scratch/ms/it/cc1f/tmp_cmor/t4cv_1850_10290/ifs_1850/CMIP6/t4cv-ifs-1850/make_dynvars-work/utendvtem_EdayZ.nc
?
Maybe that shell=False
can cause this behaviour (just guessing here)? https://stackoverflow.com/questions/36169571/python-subprocess-check-call-vs-check-output
But with shell=True
the arguments are not passed.
Also this seems interesting: https://stackoverflow.com/questions/43990219/how-does-subprocess-call-work-with-shell-false
No the /scratch/ms/it/cc1f/tmp_cmor/t4cv_1850_10290/ifs_1850/CMIP6/t4cv-ifs-1850/make_dynvars-work/utendvtem_EdayZ.nc
is not produced since that would be generated by concatenating the outputs of month 1 to 12 by a cdo mergetime
at the very end of make_dynvars.sh
. The problem is that only 5 months and a bit are processed so these files named variable_table.nc
are yet to appear.
Does your script launch child processes (via the &
in bash or subprocess/multiprocess in python)?
Have you tried executing the script outside of ece2cmor
, by just passing the arguments yourself?
No, no child processes I think. Only an executable python script which I added to PYTHONPATH
as suggested. Yes I have tested days ago and I did not make big changes since then. I will test again just to be sure...
Because the process seems to end in the 6th month (not the first), I don't think there is any fundamental flaw with the subprocess call. It could e.g. be that cdo crashes with a zero status code (this shouldn't happen, but you never know with cdo), making ece2cmor3
think that the script execution was ok.
I can also recommend putting debug echo statements between all cdo calls.
Ok I will make another attempt to see if cdo is silently exiting. I've set the verbose mode for the script but there is no information there. It will take some hours, thanks for now
From your output above I suspect something goes wrong in the ml2plx call for variable 130. This could be also unsufficient memory or disk space. Are you submitting to the parallel queue with a full node at your disposal?
Running on the fractional one (nf
) and now I also take care of removing the big grib/netCDF files which may take too much space. However my $SCRATCH
should have enough space. Indeed it may also be the system to kill the process for some reason
Also you could comment out all the removals of intermediate files in the script, to get a better idea where things go bad.
Another idea is to remove the 'regular' variables from the varlist, then ece2cmor3 will run only your script variables. This may shine a light on whether there is interference from other tasks.
Thanks for the tips. Good idea about removing variables! I'll try editing the ifspar
keeping only a small selection of standard variables. This should clarify what is going wrong with the external process indeed
Hi @fserva no need to edit ifspar.json! Just look for the --varlist
or --drq
option in the ece2cmor call and edit that file, only keeping your tendencies in the data request
Some preliminary results: I have removed most variables from the dreq json file and the execution seems to be going on.
The scripts is running, and all the other variables should have been cmorized (the files are in the working directory named like variable_table.nc
; I imagine all the files should be moved from their current to the cmorized folder at the end of the process?
So +1 for the occasional cdo error so far...
Update: currently past month 6, processing month 7. The random cdo error is getting more credit, so I may need to add some trap in the bash script!
Yes once all your pre-processed nc-files have been successfully produced, ece2cmor3
will process them and create the cmorized files in the final output directory. You don't have to move them anywhere in your script.
Nope, another failure, apparently similar to the previous one:
#[18:14:55] make_dynvars.sh[59]> fvar=./t4cv_u_ml_185007.nc
#[18:14:55] make_dynvars.sh[61]> cdo -R -f nc -sp2gpl ./t4cv_u_ml_185007.grb ./t4cv_u_ml_185007.nc
2021-01-07 18:17:01 ERROR:ece2cmor3.ifs2cmor: Output file /scratch/ms/it/cc1f/tmp_cmor/t4cv_1850_11723/ifs_1850/
CMIP6/t4cv-ifs-1850/make_dynvars-work/utendvtem_EdayZ.nc of script /perm/ms/it/cc1f/ecearth3/post/tem-diag/make_
dynvars.sh could not be found... skipping cmorization of task
But I may have figured out the problem. Also now runtime is about 4 hours. I think somewhere a limit of 4 hours is set, but evidently it is too short for this. I'll try increasing the time limit to 12 hours... Perhaps this also explain why the scheduler quits without much explanation.
Hello @goord this time the process went further but I believe there is a problem with the cmorization due to the fact that variables are zonal mean.
As you can see the script ends ok...
#[02:28:59] make_dynvars.sh[218]> exit 0
2021-01-08 02:28:59 INFO:ece2cmor3.ifs2cmor: Cmorizing source variable 221 to target variable utendvtem...
2021-01-08 02:29:02 ERROR:ece2cmor3.ifs2cmor: Could not retrieve x-coordinate values from {'gridtype': 'gaussian', 'gridsize': 256, 'ysize': 256, 'yunits': '"degrees_north"', 'yname': 'lat', 'np': 128, 'ylongnam
e': '"latitude"', 'yvals': array([ 89.46282157, 88.76695135, 88.06697165, 87.36606343,
[...]
2021-01-08 02:29:02 ERROR:ece2cmor3.ifs2cmor: Invalid grid detected in post-processed data: {'gridtype': 'gaussian', 'gridsize': 256, 'ysize': 256, 'yunits': '"degrees_north"', 'yname': 'lat', 'np': 128, 'ylongn
ame': '"latitude"', 'yvals': array([ 89.46282157, 88.76695135, 88.06697165, 87.36606343,
[...]
File "/perm/ms/it/cc1f/miniconda2/envs/ece2cmor3/lib/python2.7/site-packages/ece2cmor3-1.6.1-py2.7.egg/ece2cmor3/ifs2cmor.py", line 538, in define_cmor_axes
grid_id = grid_ids[0]
TypeError: 'NoneType' object has no attribute '__getitem__'
Should I write somewhere that the grid has no x-values, being zonal mean?
Hi @fserva, I will take over from here, if you can send me some of your files I can adapt the code to correctly cmorize it. Do you have an ftp server or google drive so I can get at least 1 of the nc files produced by your script?
This issue is related to ec-earth portal issue 898.
For some zonal mean (wave atmos TEM) variables Federico Serva has written a python script to compute these results. See issue 1 in that repository.
What is needed here, is that
ece2cmor
allows an option in which a (python) script is called instead of the calls to cdo. We aim for an interface via the ifspar.json. An additional option, like the expression option, should allow the external script call.