NCAS-CMS / cf-python

A CF-compliant Earth Science data analysis library
http://ncas-cms.github.io/cf-python
MIT License

`aggregate`: explicit feedback about un-aggregatable outcome #789

Open sadielbartholomew opened 2 months ago

sadielbartholomew commented 2 months ago

(In our summit today) a user conveyed that they would like explicit notification when a call to `cf.aggregate` doesn't aggregate the input any further. For example, for `h` in the example below, unless verbosity is set to at least level 2/info, nothing indicates that the entire FieldList was not aggregatable further, i.e. that aggregation didn't work. Even at level 2 you have to count the fields reported as unaggregatable, or check the length of the resulting FieldList, to confirm this.

I agree that the above is not ideal: input fields not being combined could easily be seen as a 'failure' outcome, and to me that merits at least 3/warning level output, with a summary line covering the whole FieldList result and not just each field individually, as in the log output at level 2.

Example

>>> import cf
>>> f = cf.example_fields()
>>> g = cf.aggregate(f)  # pre-aggregate down
>>> len(g)
11
>>> h = cf.aggregate(g)  # won't aggregate further
>>> len(h)
11
>>> h = cf.aggregate(g, verbose=2)  # won't aggregate further
Unaggregatable 'air_temperature' has been output: <CF AuxiliaryCoordinate: long_name=Grid latitude name(10) > has no identity or no data
Unaggregatable 'precipitation_flux' has been output: <CF AuxiliaryCoordinate: cf_role=timeseries_id(4) > has no identity or no data
Unaggregatable 'air_temperature' has been output: <CF AuxiliaryCoordinate: cf_role=timeseries_id(3) > has no identity or no data
Unaggregatable 'precipitation_amount' has been output: <CF AuxiliaryCoordinate: cf_role=timeseries_id(2) > has no identity or no data
Unaggregatable 'mole_fraction_of_ozone_in_air' has been output: <CF AuxiliaryCoordinate: cf_role=trajectory_id(1) > has no identity or no data

My suggestion is, specifically, that `h = cf.aggregate(g)  # won't aggregate further` should here report a line along the lines of "FieldList was not aggregatable.", perhaps followed by a suggestion such as "Try applying further keywords to relax the aggregation conditions if you wish to try to combine the fields further.".
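To make the proposal concrete, here is a minimal user-side sketch, not a change to `cf.aggregate` itself. The wrapper name `aggregate_with_summary`, its `aggregate_func` parameter and the use of the standard `logging` module are all assumptions made for illustration. The only check is the one described above: whether the output has as many fields as the input.

```python
import logging

logger = logging.getLogger(__name__)

# Message wording taken from the suggestion in this issue.
SUMMARY = (
    "FieldList was not aggregatable. Try applying further keywords to "
    "relax the aggregation conditions if you wish to try to combine "
    "the fields further."
)


def aggregate_with_summary(fields, aggregate_func, **kwargs):
    """Hypothetical wrapper: call an aggregation function and warn once
    if the result contains as many fields as the input.

    `aggregate_func` is passed in (e.g. cf.aggregate) so that this
    sketch does not depend on cf being importable.
    """
    result = aggregate_func(fields, **kwargs)
    if len(result) == len(fields):
        # One summary line for the whole result, not one per field.
        logger.warning(SUMMARY)
    return result
```

With cf installed, the example above would become `h = aggregate_with_summary(g, cf.aggregate)`. Equal input and output lengths do not always mean something went wrong, so the warning level here is part of the proposal and not a settled choice.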

davidhassell commented 1 month ago

Thanks for summarising this discussion, Sadie. Would making the default verbose=2 be sufficient? Followed by a single helpful hint message if any "unaggregatable" messages were printed? I'm not too keen on highlighting when the number of fields in equals the number of fields out - that's not always an error!
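The alternative of a single hint, shown only when at least one "unaggregatable" message was emitted, could look roughly like the sketch below. Everything here is hypothetical: the names `run_with_hint` and `_PrefixCounter` are invented, the hint wording is a placeholder, and the logger is passed in because this thread does not say which logger carries the "Unaggregatable" messages.

```python
import logging

# Placeholder wording for the single hint.
HINT = (
    "Some fields could not be aggregated. Try applying further keywords "
    "to relax the aggregation conditions."
)


class _PrefixCounter(logging.Handler):
    """Count log records whose message starts with a given prefix."""

    def __init__(self, prefix):
        super().__init__()
        self.prefix = prefix
        self.count = 0

    def emit(self, record):
        if record.getMessage().startswith(self.prefix):
            self.count += 1


def run_with_hint(log, call, prefix="Unaggregatable"):
    """Run `call()` and emit one hint on `log` if any message starting
    with `prefix` was logged while it ran.

    `log` is assumed to be the logger that the aggregation messages go
    through; messages below that logger's level are never counted.
    """
    counter = _PrefixCounter(prefix)
    log.addHandler(counter)
    try:
        result = call()
    finally:
        log.removeHandler(counter)
    if counter.count:
        log.info(HINT)
    return result
```

Unlike a comparison of field counts in and out, this only speaks up when a field was actually reported as unaggregatable, so a legitimately unchanged result stays silent.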