NCAS-CMS / cf-python

A CF-compliant Earth Science data analysis library
http://ncas-cms.github.io/cf-python
MIT License
120 stars 19 forks source link

field.collapse() fails with measure with redundant dimension #278

Closed dlrhodson closed 1 year ago

dlrhodson commented 2 years ago

Hi Folks,

I've encountered a problem with field.collapse() (cf 3.11.0) when using a measure with a redundant dimension. See attached collapse_error.py for an example. The problem appears to be this line in field.py

        iaxes = [clm_axes0.index(axis) for axis in clm_axes]

examining the next line:

        clm.squeeze(iaxes, inplace=True)

I think iaxes is supposed to be the indices of the axes in clm_axes0 that don't appear in clm_axes - but this line does the opposite of this, and clm.squeeze() then attempts to squeeze the axes that are of greater that unit length (and fails).

in the attached collapse_error_fixed.py I suggest a fix that seems to work for me

          iaxes = [clm_axes0.index(axis) for axis in (set(clm_axes0)-set(clm_axes))]

which produces the indices of the axes in clm_axes0 that don't appear in clm_axes.

This error only occurs in the unusual case where the measure has a redundant unit dimension - and only occurred because I was attempting to use a time-varying volume measure (in the ocean) - so it may never affect anyone else!

Cheers, Dan

collapse_error.py.gz collapse_error_fixed.py.gz

sadielbartholomew commented 2 years ago

Hi Dan, thanks for reporting this and for all of the detail you have provided. We will take a look shortly and get back to you (probably by the end of the week)!

sadielbartholomew commented 2 years ago

Hi again @dlrhodson, I downloaded the files you attached and have had a look, but without the specific file referenced in those scripts (which you had at location "/badc/cmip6/data/CMIP6/CMIP/MOHC/HadGEM3-GC31-LL/historical/r1i1p1f3/Omon/volcello/gn/latest/volcello_Omon_HadGEM3-GC31-LL_historical_r1i1p1f3_gn_185001-189912.nc"), I can't be certain what error or failure emerges, nor deduce that from your notes.

I grabbed what seems to be the corresponding file from the ESGF data nodes via running the command:

$ wget http://esgf-data3.ceda.ac.uk/thredds/fileServer/esg_cmip6/CMIP6/CMIP/MOHC/HadGEM3-GC31-LL/historical/r1i1p1f3/Omon/volcello/gn/v20190624/volcello_Omon_HadGEM3-GC31-LL_historical_r1i1p1f3_gn_185001-189912.nc

and then when I run your scripts I see:

Traceback (most recent call last):
  File "collapse_error.py", line 19, in <module>
    data1.collapse('volume: integral',weights='volume',measure=True)
  File "/home/sadie/cf-python/cf/decorators.py", line 62, in precede_with_kwarg_deprecation_check
    operation_method_result = operation_method(self, *args, **kwargs)
  File "/home/sadie/anaconda3/envs/cf-env/lib/python3.8/site-packages/cfdm/decorators.py", line 171, in verbose_override_wrapper
    return method_with_verbose_kwarg(*args, **kwargs)
  File "/home/sadie/cf-python/cf/field.py", line 8638, in collapse
    d_weights = f.weights(
  File "/home/sadie/cf-python/cf/field.py", line 5468, in weights
    self._weights_measure(
  File "/home/sadie/cf-python/cf/field.py", line 3905, in _weights_measure
    clm.squeeze(iaxes, inplace=True)
  File "/home/sadie/cf-python/cf/decorators.py", line 62, in precede_with_kwarg_deprecation_check
    operation_method_result = operation_method(self, *args, **kwargs)
  File "/home/sadie/anaconda3/envs/cf-env/lib/python3.8/site-packages/cfdm/decorators.py", line 44, in inplace_wrapper
    processed_copy = operation_method(self, *args, **kwargs)
  File "/home/sadie/cf-python/cf/data/data.py", line 14129, in squeeze
    raise ValueError(
ValueError: Can't squeeze Data: Can't remove axis of size 75

Can you just confirm that this is the ultimate precise error you run into (this does indeed seem to by fixed or bypassed at least with your extra code in the _fixed script)? In which case I can advise accordingly. Thanks.

dlrhodson commented 2 years ago

Hi Sadie! Yes - sorry, I should have noted that I ran the script on JASMIN - and I should have included a short version of the file in the upload. That is the ultimate precise error that I get:

image

Does the collapse_error_fixed.py work for you?

Cheers! Dan

davidhassell commented 2 years ago

Hi Dan,

Thanks for spotting this, your analysis looks sound to me. I wonder if we can do an simpler solution, though:

        clm = clm.get_data(_fill_value=False)
        clm = clm.squeeze()

because calling the squeeze method without any arguments will find and squeeze all size 1 dimensions, if any. I don't have access to the data at the moment, but do you want to try this alternative fix (and feel free to put in PR :)).

Thanks, David

dlrhodson commented 2 years ago

Yes - that works for me too. Much better! I will try and work out how to do a PR!

davidhassell commented 1 year ago

Closing now - please reopen if required.