Closed ThibHlln closed 2 years ago
As per David's suggestion (https://github.com/NCAS-CMS/cf-python/issues/365), it makes more sense for unifhy to manually store the filenames right after the call to `cf.read`.
It turned out that this was already the case: https://github.com/unifhy-org/unifhy/blob/c4e235cf923778d8ea78f9155f8a9d6b03bf1414/unifhy/data.py#L172-L178
However, `Component` manipulates the fields contained in `DataSet` in such a way that `cf` drops the filenames along the way. In a couple of places later in the workflow, filenames are retrieved from the field directly (i.e. using the `Field.get_filenames()` method) rather than from the variable (i.e. using the `Variable.filenames` attribute):
https://github.com/unifhy-org/unifhy/blob/c4e235cf923778d8ea78f9155f8a9d6b03bf1414/unifhy/component.py#L714
https://github.com/unifhy-org/unifhy/blob/c4e235cf923778d8ea78f9155f8a9d6b03bf1414/unifhy/component.py#L744
This needs to be fixed by using the `Variable` attribute instead of the `Field` method.
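To illustrate why reading from the variable is more robust, here is a minimal sketch. It does not use the real `cf` or `unifhy` classes: `FakeField`, `Variable`, `read_variable`, and the file name `rainfall.nc` are hypothetical stand-ins, and the "manipulation drops the filenames" behaviour is only mimicked, assuming it matches what is described above.

```python
class FakeField:
    """Hypothetical stand-in for cf.Field (not the real class)."""

    def __init__(self, data, filenames=()):
        self.data = list(data)
        self._filenames = set(filenames)

    def get_filenames(self):
        return set(self._filenames)

    def subspace(self, start, stop):
        # Mimics a manipulation after which the link to the file is lost:
        # the new field is created without any filenames.
        return FakeField(self.data[start:stop])


class Variable:
    """Hypothetical stand-in for unifhy.Variable (not the real class)."""

    def __init__(self, field, filenames):
        self.field = field
        # Recorded once, right after reading, and never derived again
        self.filenames = frozenset(filenames)


def read_variable(path, data):
    # FakeField(...) stands in for the call to cf.read(path)
    fld = FakeField(data, filenames=[path])
    return Variable(field=fld, filenames=fld.get_filenames())


var = read_variable("rainfall.nc", [1.0, 2.0, 3.0, 4.0])
var.field = var.field.subspace(0, 2)  # later manipulation of the field

from_field = var.field.get_filenames()  # filenames lost on the field
from_variable = var.filenames           # filenames kept on the variable
```

In this sketch, the filenames stored on the variable survive any later manipulation of the field, which is the reason for preferring the attribute over the method at the two locations linked above.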
There seems to be a bug in `unifhy` when the files contained in the `unifhy.DataSet` are small enough to fit in memory. This seems to be linked to the documented behaviour of `cf.Field.get_filenames`.
As a result, the filenames attribute of a given `unifhy.Variable` is an empty set. An empty sequence of filenames is ultimately saved in the YAML file, so that a `to_yaml` > `from_yaml` workflow fails.
It would be good to check with `cf-python` whether there is another functionality that keeps track of filenames, or whether it makes sense for their package to offer such functionality. If not, it will be up to `unifhy` to keep track of them.