Closed pgierz closed 17 hours ago
For developers: the unitless mapping table is read when the CMORizer object gets initialised (either directly, this is uncommon) or via the classmethod
from_dict
(this is the common case), and attached to all rules that the CMORizer
knows about. You can access them on the rule object as attribute dimensionless_unit_mappings
.
@pgierz , nice to see the dimensionless_unit_mappings
set on rule. I guess unit conversion function should looked out for this attribute and act accordingly. If so, I will make necessary changes to that function. One follow up question: Should the units in the final output point to units in the CMIP table or the one used in this unit_mapping? For instance, "0.001" is the unit in the table. Its unit_mapping says "g/kg". The later is used by the unit conversion function. I guess the unit representation in the final output should still be "0.001" as suggested in the table. Right?
I guess the unit representation in the final output should still be "0.001" as suggested in the table. Right?
Correct. At the end of the CMOR process, we need to have metadata attributes as defined in the CMOR Tables.
Hi guys, just so I make sure I am following, this are the steps and tooling needed correct?
dimensionless_unit_mappings
. @pgierz this is also what you are saying here as well right?the unitless mapping table is read when the CMORizer object gets initialised (either directly, this is uncommon) or via the classmethod from_dict (this is the common case), and attached to all rules that the CMORizer knows about
[x] Prepare checks raise errors if:
[x] Modify the unit conversion to get the attribute from CMOR table where we stored the physically meaningfull unit (g/kg) and use that to do the conversion.
Any step missing?
I'd suggest we start writing the dimensionless_unit_mappings yaml with just one variable to test, for example one of the salinities that @siligam have shown from his tables.
- read the dimensionless_unit_mappings and assign them to their corresponding CMOR variables. @pgierz this is also what you are saying here as well right?
the unitless mapping table is read when the CMORizer object gets initialised (either directly, this is uncommon) or via the classmethod from_dict (this is the common case), and attached to all rules that the CMORizer knows about
Partially. The mappings are read and all are assigned to each Rule
: no filtering is done. If you want each Rule
to only know about one specific mapping, that is easy enough to do, I can add that.
I'd suggest we start writing the dimensionless_unit_mappings yaml with just one variable to test, for example one of the salinities that @siligam have shown from his tables.
Probably it would be a good idea to turn that into a unit test so it is run automatically.
I will implement the tastcase for salinity using the dimensionless_unit_mappings feature.
Just for reference, model-table unit listing https://notes.desy.de/CXe0axlgSbStWgOLjaXnlQ?view
- read the dimensionless_unit_mappings and assign them to their corresponding CMOR variables. @pgierz this is also what you are saying here as well right?
the unitless mapping table is read when the CMORizer object gets initialised (either directly, this is uncommon) or via the classmethod from_dict (this is the common case), and attached to all rules that the CMORizer knows about
Partially. The mappings are read and all are assigned to each
Rule
: no filtering is done. If you want eachRule
to only know about one specific mapping, that is easy enough to do, I can add that.
I'm not understanding something: don't the rules belong to the model side of things?
Why would we need to append it to the rule and not to the CMOR object? The mapping is for the CMOR unit, not the model unit. Please guys, let me know if I need to reread the documentation or the code, I might be saying bullshit here
Please guys, let me know if I need to reread the documentation or the code, I might be saying bullshit here
This is badly documented, so read the code ;-) I am talking about the CMORizer
object here (the main controller) which has information about everything It's the part that controls the parallelisation and knows which rules exist, what the user configuration looks like, how to match up model files to cmor table entries etc etc.
At the moment, it looks like this:
>>> user_config_file = pathlib.Path("/some/file.yaml")
>>> f = yaml.safe_load(user_config_file)
>>> cmorizer = CMORizer.from_dict(f)
>>> type(cmorizer.rules)
list
>>> first_rule = cmorizer.rules[0]
>>> first_rule.model_variable
"sst"
>>> first_rule.cmor_variable
"sos"
>>> first_rule.dimensionless_unit_mappings
{"sst": {"0.001": "g/kg"}, "other_key": {"1": "µmol mol^-1"}, ...}
Of course this example is nonsense, sst isn't a concentration, but just for illustrative purposes of what sorts of things are where in the objects.
I guess the unit representation in the final output should still be "0.001" as suggested in the table. Right?
Correct. At the end of the CMOR process, we need to have metadata attributes as defined in the CMOR Tables.
Yes, the units in the tables are the target, they are the convention.
So, I have integrated dimensionless_mappings logic into unit conversion function and it appears work as expected. I have run examples/sample.yaml
which produced the following:
The slurm log shows the expected unit conversion details:
> grep Converting slurm-13789839.out
Converting units: (thetao -> thetao) degC -> degC
Converting units: (CO2f -> fgco2) mmolC/m2/d -> kg m-2 s-1
Converting units: (so -> so) psu -> g/kg (0.001)
The units in processed netcdf file shows the correct unit as per the Table.
> ncdump -h so_Omon_AWI-ocean_piControl_r1i1p1f1_gn_278401-278412.nc | grep so:units
so:units = "0.001" ;
What I am still missing here is a unit test for the dimensionless unit conversion.
I'm going to have to do a force push on this branch at some point, there are a few features mixed together in this one:
~@mandresm: is it worth it, just to keep things "clean"? Or more confusing?~
Update: No force push needed, changes in this PR now exclusively deal with the relevant conversion logic.
Regarding getting unit information like g/kg in addition to the fractional scalar: I had a look. The physical units of a quantity appear to be hidden deep down in the details of physical parameters (https://bit.ly/CMIP-DR-physical-parameters). The Variables defined in the CMOR tables then refer to the different physical parameters and inherit their unit information.
Commenting on individual physical parameters is possible via this form: https://airtable.com/appqRFkdpwAitEZNY/pagBlOWjK6JfvHtzU/form
Doing so is possible, but with significant work per parameter. Are we still confident that this additional information is necessary for a proper workflow? If so, I will write to CMIP-IPO to see what they can and are willing to do from a more fundamental level. But at the moment it does not appear to me to boil down to just adding another comment or field to the data base at hand.
This is just the way I understand the information on airtable - let me know if you have a different understanding.
Thanks @christian-stepanek for looking into this!
The Variables defined in the CMOR tables then refer to the different physical parameters and inherit their unit information.
Meaning that for example albsrfc
will inherit 1
as unit right?
Doing so is possible, but with significant work per parameter. Are we still confident that this additional information is necessary for a proper workflow? If so, I will write to CMIP-IPO to see what they can and are willing to do from a more fundamental level. But at the moment it does not appear to me to boil down to just adding another comment or field to the data base at hand.
This is just the way I understand the information on airtable - let me know if you have a different understanding.
In my view this is not additional information, is just that 1
or 0.001
is not a unit, IMO it's fundamentally wrong, and programatically imprecise. If anything 1
speaks of a range (from 0 to 1) and 0.001
speaks of a ratio. This is not what I expect from a standard for doing international model comparisons. I am aware I am extremely picky though and I might not be getting the full picture here...
Is it possible to ask their opinion to other members of the community before starting to formally discuss in airtable? Or is airtable the correct place to open the discussion?
In practical terms for pymorize, it is very little work for us to provide the mapping functionality, so I don't see a big technical problem for us.
@pgierz, does this commit make sense to you or is there a better way of doing this keeping the data
directory in the root directory? bc5586a
Blocked by #66
Tests are failing, I'll look into that on Monday
Work for dimensionless units. Will close #54