NCAS-CMS / cf-python

A CF-compliant Earth Science data analysis library
http://ncas-cms.github.io/cf-python
MIT License

Error in read_record_data when processing 6-hourly UM data #817

Open ellgil82 opened 1 month ago

ellgil82 commented 1 month ago

I'm getting an error when trying to collapse UM files to go from sub-daily to daily frequency. The data are 6-hourly maximum air temperature (max across all timesteps).

I'm new to cf-python so unsure about where/how to start with debugging this, and read_record_data doesn't tell me much about the source of the error:

umfile: error condition detected in routine read_record_data_core_dbl
umfile: error condition detected in routine read_record_data
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/apps/jasmin/jaspy/miniforge_envs/jaspy3.11/mf3-23.11.0-0/envs/jaspy3.11-mf3-23.11.0-0-v20240815/lib/python3.11/site-packages/cf/decorators.py", line 71, in precede_with_kwarg_deprecation_check
    operation_method_result = operation_method(self, *args, **kwargs)
                              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/apps/jasmin/jaspy/miniforge_envs/jaspy3.11/mf3-23.11.0-0/envs/jaspy3.11-mf3-23.11.0-0-v20240815/lib/python3.11/site-packages/cfdm/decorators.py", line 171, in verbose_override_wrapper
    return method_with_verbose_kwarg(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/apps/jasmin/jaspy/miniforge_envs/jaspy3.11/mf3-23.11.0-0/envs/jaspy3.11-mf3-23.11.0-0-v20240815/lib/python3.11/site-packages/cf/field.py", line 6931, in collapse
    f = f._collapse_grouped(
        ^^^^^^^^^^^^^^^^^^^^
  File "/apps/jasmin/jaspy/miniforge_envs/jaspy3.11/mf3-23.11.0-0/envs/jaspy3.11-mf3-23.11.0-0-v20240815/lib/python3.11/site-packages/cfdm/decorators.py", line 171, in verbose_override_wrapper
    return method_with_verbose_kwarg(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/apps/jasmin/jaspy/miniforge_envs/jaspy3.11/mf3-23.11.0-0/envs/jaspy3.11-mf3-23.11.0-0-v20240815/lib/python3.11/site-packages/cf/field.py", line 8628, in _collapse_grouped
    f = self.concatenate(fl, axis=iaxis, cull_graph=True)
        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/apps/jasmin/jaspy/miniforge_envs/jaspy3.11/mf3-23.11.0-0/envs/jaspy3.11-mf3-23.11.0-0-v20240815/lib/python3.11/site-packages/cf/field.py", line 3189, in concatenate
    new_data = Data.concatenate(
               ^^^^^^^^^^^^^^^^^
  File "/apps/jasmin/jaspy/miniforge_envs/jaspy3.11/mf3-23.11.0-0/envs/jaspy3.11-mf3-23.11.0-0-v20240815/lib/python3.11/site-packages/cf/data/data.py", line 4068, in concatenate
    d.cull_graph()
  File "/apps/jasmin/jaspy/miniforge_envs/jaspy3.11/mf3-23.11.0-0/envs/jaspy3.11-mf3-23.11.0-0-v20240815/lib/python3.11/site-packages/cf/data/data.py", line 11470, in cull_graph
    dsk, _ = cull(dx.dask, dx.__dask_keys__())
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/apps/jasmin/jaspy/miniforge_envs/jaspy3.11/mf3-23.11.0-0/envs/jaspy3.11-mf3-23.11.0-0-v20240815/lib/python3.11/site-packages/dask/optimization.py", line 61, in cull
    dependencies_k = get_dependencies(dsk, k, as_list=True)  # fuse needs lists
                     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/apps/jasmin/jaspy/miniforge_envs/jaspy3.11/mf3-23.11.0-0/envs/jaspy3.11-mf3-23.11.0-0-v20240815/lib/python3.11/site-packages/dask/core.py", line 306, in get_dependencies
    return keys_in_tasks(dsk, [arg], as_list=as_list)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/apps/jasmin/jaspy/miniforge_envs/jaspy3.11/mf3-23.11.0-0/envs/jaspy3.11-mf3-23.11.0-0-v20240815/lib/python3.11/site-packages/dask/core.py", line 194, in keys_in_tasks
    if w in keys:
       ^^^^^^^^^
  File "<frozen _collections_abc>", line 780, in __contains__
  File "/apps/jasmin/jaspy/miniforge_envs/jaspy3.11/mf3-23.11.0-0/envs/jaspy3.11-mf3-23.11.0-0-v20240815/lib/python3.11/site-packages/dask/highlevelgraph.py", line 517, in __getitem__
    return self.layers[key[0]][key]  # type: ignore
                       ~~~^^^
  File "/apps/jasmin/jaspy/miniforge_envs/jaspy3.11/mf3-23.11.0-0/envs/jaspy3.11-mf3-23.11.0-0-v20240815/lib/python3.11/site-packages/cf/data/array/umarray.py", line 190, in __getitem__
    array = rec.get_data().reshape(self.shape)
            ^^^^^^^^^^^^^^
  File "/apps/jasmin/jaspy/miniforge_envs/jaspy3.11/mf3-23.11.0-0/envs/jaspy3.11-mf3-23.11.0-0-v20240815/lib/python3.11/site-packages/cf/umread_lib/umfile.py", line 435, in get_data
    return c.read_record_data(
           ^^^^^^^^^^^^^^^^^^^
  File "/apps/jasmin/jaspy/miniforge_envs/jaspy3.11/mf3-23.11.0-0/envs/jaspy3.11-mf3-23.11.0-0-v20240815/lib/python3.11/site-packages/cf/umread_lib/cInterface.py", line 591, in read_record_data
    raise umfile.UMFileException("Error reading record data")
cf.umread_lib.umfile.UMFileException: Error reading record data

Any help would be appreciated!

ellgil82 commented 1 month ago

Update: looking at the data with iris has shown this is probably related to WGDOS packing errors.

sadielbartholomew commented 1 month ago

Hi @ellgil82, thanks for your question. I can't immediately see what the issue might be from your traceback - you are right it isn't the most informative error report that emerges, and we will look to improve on it.

Unless we have the specific dataset to play around with, we can't recreate the issue you are hitting. So if there's a way you can point us to the datasets in question, that would be particularly useful: do you use or have access to JASMIN? If so, a path to the datasets there would work; otherwise, could you share the files some other way?

For now, if you pass verbose=-1, or equivalently verbose="DEBUG", to your collapse call (e.g. f.collapse("minimum", verbose=-1)), this turns the verbosity 'up to 11' (for info see https://ncas-cms.github.io/cf-python/tutorial.html#controlling-output-messages if useful) and should provide more information that might help to pinpoint the issue. Since most of our UM reading capability is implemented in C under the hood, I am not sure how much useful information will emerge, but it is worth a try, so please share the output here (or the relevant parts of it, as the debug output can be very dense).
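As a rough illustration of the suggestion above, here is a minimal sketch of a wrapper that makes the collapse call with debug verbosity and prints the exception type and message before re-raising. The helper name `collapse_with_debug` is hypothetical (it is not part of cf-python), and the file name and `group` argument in the commented usage are assumptions about your setup:

```python
def collapse_with_debug(field, method, **kwargs):
    """Call field.collapse(method, ...) with verbose=-1 (i.e. "DEBUG").

    Hypothetical helper, not part of cf-python. Any exception is
    summarised on stdout and then re-raised unchanged.
    """
    try:
        return field.collapse(method, verbose=-1, **kwargs)
    except Exception as exc:
        print(f"collapse({method!r}) failed: {type(exc).__name__}: {exc}")
        raise


# Possible usage (assumed file name and grouping, adjust to your case):
#
#   import cf
#   f = cf.read("my_um_file.pp")[0]
#   daily_max = collapse_with_debug(f, "T: maximum", group=cf.D())
```

The wrapper adds nothing beyond what the verbose keyword already gives you. It only keeps the debug setting and the error summary in one place while you experiment.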

ellgil82 commented 1 month ago

Thanks Sadie for this. I've tried the verbose output option but I'm not getting much more in the error description. The error just seems to pop up halfway through the collapse.

Did a bit more digging and it seems like one of my files has been corrupted / data is packed incorrectly. Will re-make that file and see if it changes anything. Thanks for your help!

sadielbartholomew commented 1 month ago

OK thanks for trying that Ella. I can't say for sure, but your hypothesis of incorrect packing or corruption of data could well be the case given the context. Sorry if we can't pinpoint anything more specific - it is considerably more difficult to diagnose a precise issue from the C code component of the library (which is mostly for reading UM) than the Python part (the rest of the functionality in cf-python).

If you are still having difficulty getting cf-python to do what you want it to after re-making the file, please comment here again and we can think about what else we can try.
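If it is unclear which input file is the bad one, one option is to force each file's data to be read on its own and record which ones fail. The sketch below is only an illustration under assumptions: `find_unreadable_files` and `load_with_cf` are hypothetical helper names, and the loader assumes that accessing a field's `array` attribute is enough to make cf-python read every record from disk.

```python
def find_unreadable_files(paths, load_all_data):
    """Return {path: error message} for each file whose data cannot be read.

    load_all_data is a caller-supplied function that must force all of
    the data in one file to be read, raising an exception on failure.
    """
    failures = {}
    for path in paths:
        try:
            load_all_data(path)
        except Exception as exc:
            # Record the failure and carry on with the remaining files
            failures[path] = f"{type(exc).__name__}: {exc}"
    return failures


def load_with_cf(path):
    """Hypothetical loader: read every field and pull its data into memory."""
    import cf  # imported lazily so the helper above does not depend on cf

    for field in cf.read(path):
        field.array  # assumption: this forces the lazy data to be read


# Possible usage (assumed file names):
#
#   bad = find_unreadable_files(["day1.pp", "day2.pp"], load_with_cf)
#   for path, message in bad.items():
#       print(path, "->", message)
```

Reading everything into memory can be slow for large files, so it may be worth running this on a subset first.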