NCAS-CMS / cfdm

A Python reference implementation of the CF data model
http://ncas-cms.github.io/cfdm
MIT License
28 stars 11 forks source link

Initiation of a Domain Axis: clarify `source` API compat. & fix `size` #311

Open sadielbartholomew opened 2 months ago

sadielbartholomew commented 2 months ago

Using cfdm.DomainAxis (and filtering downstream to cf-python, so cf.DomainAxis equivalently) with the source keyword to create a Domain Axis construct works fine when the source it itself a Domain Axis, but when it is a coordinate, the produced construct has not size set and is given a generic key name, when these should be inferred from the coordinate data (unless I am missing something and this can't be done), especially from the statement in the documentation for that parameter which says (my bold to highlight):

Convert source, which can be any type of object, to a DomainAxis instance.

All other parameters, apart from copy, are ignored and their values are instead inferred from source by assuming that it has the DomainAxis API. Any parameters that can not be retrieved from source in this way are assumed to have their default value.

To illustrate:

>>> # Use an example field to illustrate with existing constructs - see what we have first
>>> f = cfdm.example_field(1)
>>> f.domain_axes(todict=True)
{'domainaxis0': <DomainAxis: size(1)>, 'domainaxis1': <DomainAxis: size(10)>, 'domainaxis2': <DomainAxis: size(9)>, 'domainaxis3': <DomainAxis: size(1)>}
>>> f.dimension_coordinates(todict=True)
{'dimensioncoordinate0': <DimensionCoordinate: atmosphere_hybrid_height_coordinate(1) m>, 'dimensioncoordinate1': <DimensionCoordinate: grid_latitude(10) degrees>, 'dimensioncoordinate2': <DimensionCoordinate: grid_longitude(9) degrees>, 'dimensioncoordinate3': <DimensionCoordinate: time(1) days since 2018-12-01 >}
>>>
>>> # Now try creating a new Domain Axes with some existing constructs
>>> # This works fine
>>> da1 = f.construct("domainaxis1")
>>> da1
<DomainAxis: size(10)>
>>> new_domain_axes_0 = cfdm.DomainAxis(source=da1)
>>> new_domain_axes_0
<DomainAxis: size(10)>
>> # But this does not
>>> dc1 = f.dimension_coordinate("dimensioncoordinate1")
>>> dc1
<DimensionCoordinate: grid_latitude(10) degrees>
>>> new_domain_axes_1 = cfdm.DomainAxis(source=dc1)
>>> new_domain_axes_1
<DomainAxis: size()>
>>> # Note the above has no size set, but we expect it have size of dc1 namely size 10.
>>> # Note also the key names are generic for the dim coord case, when get set on the field:
>>> f.set_construct(new_domain_axes_0)
'domainaxis4'
>>> f.set_construct(new_domain_axes_1)
'domainaxis5'
>>> f.domain_axes(todict=True)
{'domainaxis0': <DomainAxis: size(1)>, 'domainaxis1': <DomainAxis: size(10)>, 'domainaxis2': <DomainAxis: size(9)>, 'domainaxis3': <DomainAxis: size(1)>, 'domainaxis4': <DomainAxis: size(10)>, 'domainaxis5': <DomainAxis: size()>}
>>> f.dump()
---------------------------------
Field: air_temperature (ncvar%ta)
---------------------------------
Conventions = 'CF-1.11'
project = 'research'
standard_name = 'air_temperature'
units = 'K'

Data(atmosphere_hybrid_height_coordinate(1), grid_latitude(10), grid_longitude(9)) = [[[262.8, ..., 269.7]]] K

Cell Method: grid_latitude(10): grid_longitude(9): mean where land (interval: 0.1 degrees)
Cell Method: time(1): maximum

Field Ancillary: air_temperature standard_error
    standard_name = 'air_temperature standard_error'
    units = 'K'
    Data(grid_latitude(10), grid_longitude(9)) = [[0.76, ..., 0.32]] K

Domain Axis: atmosphere_hybrid_height_coordinate(1)
Domain Axis: grid_latitude(10)
Domain Axis: grid_longitude(9)
Domain Axis: key%domainaxis5()
Domain Axis: ncdim%y(10)
Domain Axis: time(1)
...

As a side note, the size parameter is accepting arguments that aren't integers, which should also be fixed:

>>> # Further problem - can assign invalid sizes
>>> cfdm.DomainAxis(size="bad size, I'm a string!")
<DomainAxis: size(bad size, I'm a string!)>
>>> cfdm.DomainAxis("bad size, I'm a string!")
<DomainAxis: size(bad size, I'm a string!)>

Environment

Platform: Linux-6.5.13-7-MANJARO-x86_64-with-glibc2.40 
HDF5 library: 1.14.2 
netcdf library: 4.9.2 
Python: 3.12.0 /home/slb93/miniconda3/envs/cf-env-312/bin/python
netCDF4: 1.6.5 /home/slb93/miniconda3/envs/cf-env-312/lib/python3.12/site-packages/netCDF4/__init__.py
numpy: 1.26.4 /home/slb93/miniconda3/envs/cf-env-312/lib/python3.12/site-packages/numpy/__init__.py
cfdm.core: 1.11.1.0 /home/slb93/miniconda3/envs/cf-env-312/lib/python3.12/site-packages/cfdm/core/__init__.py
scipy: 1.14.0 /home/slb93/miniconda3/envs/cf-env-312/lib/python3.12/site-packages/scipy/__init__.py
cftime: 1.6.2 /home/slb93/miniconda3/envs/cf-env-312/lib/python3.12/site-packages/cftime/__init__.py
netcdf_flattener: 1.2.0 /home/slb93/miniconda3/envs/cf-env-312/lib/python3.12/site-packages/netcdf_flattener/__init__.py
cfdm: 1.11.1.0 /home/slb93/miniconda3/envs/cf-env-312/lib/python3.12/site-packages/cfdm/__init__.py
davidhassell commented 2 months ago

Hi Sadie, I see where you're coming from, but DimensionCoordinate and DomainAxis do not have a sufficiently similar API in this case, so I think that we shouldn't be surprised that former can't correctly instantiate the latter - in particular there is no DimensionCoordinate.set_size method.

I appreciate that "same API" is slightly vague term. Perhaps we can qualify by saying that "same API for setting the initialisation parameters", or similar

It so happens that dimension coordinates are always 1-d, so it's tempting to think that it's array size could define an axis axis size, but that is clearly not the case for constructs in general - what if the construct had 4-d data?. Is new_domain_axes_1 = cfdm.DomainAxis(dc1.size) OK for you in this case?

As a side note, the size parameter is accepting arguments that aren't integers, which should also be fixed:

I agree!

sadielbartholomew commented 2 months ago

Ah, OK, thanks for clarifying David!

I appreciate that "same API" is slightly vague term. Perhaps we can qualify by saying that "same API for setting the initialisation parameters", or similar

Yes, the vagueness is probably the origin of my issue. My interpretation was, a coordinate of any form, as long of course as the coordinates were 1D, should be trivial enough to 'convert' to a Domain Axis with size and name detected. But I wasn't appreciating the subtleties. It would indeed be good to update the phrasing to aim to prevent further confusion from users.

Is new_domain_axes_1 = cfdm.DomainAxis(dc1.size) OK for you in this case?

Yes, though when we catch up today I suspect you will help me realise I don't need to create a domain axis in the first place - in brief, this is for VISION WRF model data work where I have created vertical coordinates emerging as a auxiliary coordinate and ultimately I want the subspace on the z coorindate to use that and not the parametric z coordinate (which isn't working, it is unitless so incomparable anyhow at least I think). My plan was to change the data axes to replace the current parametric z domain axes with the non-parametric calculated one - but hopefully there is a more direct way!

I agree!

I'll change the title here to reflect the much more trivial bug here :)