Closed callumrollo closed 1 year ago
@callumrollo the part in the code that writes the data is in https://github.com/ioos/ioos_qc/blob/121620a2ac8955885db0c7b0a06e8b6b95a41ac3/ioos_qc/stores.py#L157-L240
However, as I read it, I believe the API there is sane enough because you are creating a new file with old+new data and new attributes. It would be the end user's responsibility to re-write the original attributes that still apply into that dict. Maybe Kyle disagrees, let's wait for his input.
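As a rough sketch of what "re-writing the original attributes into that dict" could look like: the snippet below harvests attributes from an in-memory xarray Dataset and layers new QC attributes on top. The dict layout (top-level "attributes" plus "variables" -> name -> "attributes") is an assumption modelled on NCO-JSON, and the helper names harvest_attributes and overlay_attributes are hypothetical; check what the store you use actually expects before relying on it.

```python
import xarray as xr


def harvest_attributes(ds):
    """Collect global and per-variable attributes into an NCO-JSON-like dict.

    The layout is an assumption for illustration, not a confirmed ioos_qc format.
    """
    return {
        "attributes": dict(ds.attrs),
        "variables": {
            name: {"attributes": dict(var.attrs)}
            for name, var in ds.variables.items()
        },
    }


def overlay_attributes(base, updates):
    """Return a new dict with `updates` layered on `base`; updates win on conflicts."""
    merged = {
        "attributes": {**base.get("attributes", {}), **updates.get("attributes", {})},
        "variables": {},
    }
    names = set(base.get("variables", {})) | set(updates.get("variables", {}))
    for name in sorted(names):
        old = base.get("variables", {}).get(name, {}).get("attributes", {})
        new = updates.get("variables", {}).get(name, {}).get("attributes", {})
        merged["variables"][name] = {"attributes": {**old, **new}}
    return merged


# Made-up stand-in for the original file's contents.
original = xr.Dataset(
    {"temperature": ("time", [10.2, 10.4], {"units": "degree_Celsius"})},
    coords={"time": [0, 1]},
    attrs={"title": "hypothetical glider deployment"},
)

# Attributes that a QC step might want to add (also made up).
qc_attrs = {
    "attributes": {"history": "QC flags added"},
    "variables": {"temperature_qc": {"attributes": {"long_name": "QC flag"}}},
}

all_attrs = overlay_attributes(harvest_attributes(original), qc_attrs)
```

The resulting all_attrs keeps the original variable and global attributes and adds the QC ones, so nothing is lost if that dict is what gets written out.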
With that said, the first option, to add the QA/QC to an xarray Dataset, would preserve everything and you can save a netCDF file from it. Would that work for you?
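A minimal sketch of that first option, for illustration only. The range check here is a hand-written stand-in, not the ioos_qc routine, and the function name add_gross_range_flags, the variable names, and the `_qc` suffix are all hypothetical. The point is just that adding a flag variable to a Dataset leaves the existing attributes alone.

```python
import numpy as np
import xarray as xr


def add_gross_range_flags(ds, var, fail_span):
    """Return a copy of `ds` with a QARTOD-style flag variable added for `var`.

    Stand-in check: 1 = GOOD, 4 = FAIL (outside fail_span), 9 = MISSING (NaN).
    """
    lo, hi = fail_span
    values = ds[var].values
    flags = np.ones(values.shape, dtype="int8")
    flags[(values < lo) | (values > hi)] = 4
    flags[np.isnan(values)] = 9

    flag_attrs = {
        "long_name": f"gross range flag for {var}",
        "flag_values": np.array([1, 2, 3, 4, 9], dtype="int8"),
        "flag_meanings": "GOOD UNKNOWN SUSPECT FAIL MISSING",
    }
    out = ds.copy()  # the input Dataset is left untouched
    out[f"{var}_qc"] = (ds[var].dims, flags, flag_attrs)
    return out


# Made-up data standing in for a glider file opened with xr.open_dataset.
ds = xr.Dataset(
    {
        "temperature": (
            "time",
            [10.0, 50.0, np.nan],
            {"units": "degree_Celsius", "standard_name": "sea_water_temperature"},
        )
    },
    coords={"time": [0, 1, 2]},
    attrs={"title": "hypothetical glider deployment"},
)

flagged = add_gross_range_flags(ds, "temperature", fail_span=(0.0, 40.0))
```

From there, flagged.to_netcdf("some_output.nc") would write everything out, attributes included, assuming a netCDF backend is installed.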
PS: IMO we should not even have a saving mechanism when libraries like xarray exist. We should outsource that functionality to them and only return the QA/QC results.
Also, it looks like at the end the xarray from_dataframe method is used to pass the attributes, but that method doesn't take an attributes argument (at least not in the latest xarray version). So all these attributes are not passed anyway :-/
Edit: Nope, it is from pocean-core.
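For reference, with plain xarray the attributes would have to be attached after the conversion, since Dataset.from_dataframe only takes the DataFrame (plus a sparse flag). A small sketch with made-up data and attribute values:

```python
import pandas as pd
import xarray as xr

# Made-up data; the index name becomes the dimension name.
df = pd.DataFrame(
    {"temperature": [10.2, 10.4, 10.1]},
    index=pd.Index([0, 1, 2], name="time"),
)

ds = xr.Dataset.from_dataframe(df)

# Attributes are set afterwards, per variable and globally.
ds["temperature"].attrs.update(
    {"units": "degree_Celsius", "standard_name": "sea_water_temperature"}
)
ds.attrs["title"] = "hypothetical example"
```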
Thanks Filipe. I've opted to just use Option A to run the QC routines manually and handle the netCDF writing via xarray. imo the example notebooks should not recommend the .write method in Option C.
Indeed, those notebooks very much need an overhaul. We need users like you, and a doc sprint, to start addressing that.
Should've seen that one coming :P I expect I'll end up addressing #58 in the next few weeks. I can write it up into a nice notebook
If we ever meet in person I'll pay in :beer: or :coffee:.
> Also, looks like at the end the xarray from_dataframe method is used to pass the attributes but that method doesn't take an attribute (at least not in the latest xarray version). So all these attributes are not passed anyway :-/
That isn't an xarray.from_dataframe, it's using a DSG class's from_dataframe method from pocean. The class you want is passed in as an argument to the save method. These do take an NCO-JSON compliant dict and apply it to the netCDF file.
I'm open to the API being whatever users want it to be! The reason the code was split into streams and stores is so we can access data from anywhere and save it anywhere... we just need to implement the classes as they become use-cases. I will say that I use PandasStore for collecting the QC results and manually write the data out to different formats as needed.
CFNetCDFStore was meant to write to a new file
NetcdfStore was meant to append/update an existing file
I can see a use for an XarrayStore that returns an in-memory (or lazy) xarray.Dataset of the QC results that a user can then choose to combine with another xarray.Dataset of the original data. If they have the same coords it's a one-liner in xarray. That would be a really clean API.
I like this idea.
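To make the one-liner concrete, here is a sketch under the assumption that a (so far hypothetical) XarrayStore hands back a Dataset of flags on the same coords as the original data. Both Datasets below are made up; xr.merge is the only real API involved.

```python
import numpy as np
import xarray as xr

# Stand-in for the original data, attributes intact.
original = xr.Dataset(
    {
        "salinity": (
            "time",
            [35.1, 35.3, 99.0],
            {"units": "1", "standard_name": "sea_water_practical_salinity"},
        )
    },
    coords={"time": [0, 1, 2]},
    attrs={"title": "hypothetical glider trajectory"},
)

# Stand-in for what a hypothetical XarrayStore might return.
qc_results = xr.Dataset(
    {
        "salinity_qc": (
            "time",
            np.array([1, 1, 4], dtype="int8"),
            {"flag_meanings": "GOOD UNKNOWN SUSPECT FAIL MISSING"},
        )
    },
    coords={"time": [0, 1, 2]},
)

# The one-liner: global attrs are taken from the first Dataset.
combined = xr.merge([original, qc_results], combine_attrs="override")
```

The variable attributes on salinity and the global title both survive the merge, which is the behaviour the original report was missing.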
> That isn't an xarray.from_dataframe, it's using a DSG classes' from_dataframe method from pocean. The class you want is passed in as an argument to the save method. These do take an NCO-JSON compliant dict and applies it to netCDF file.
I saw that right after I wrote the comment.
I'm trying to set up QARTOD QC for glider trajectory netCDF files, similar to the Issue #58 raised by Kerfoot. I would like to read in a netCDF, apply QARTOD flags and write out the netCDF.
The examples with netCDF silently drop all the attributes from the variables being flagged, and also drop attributes from the coordinates. In the example QartodTestExample_netCDF.ipynb the input nc has quite comprehensive attributes:
After QC is applied and the result is saved to nc, these attributes have been erased.
This seems odd, as the flag variables are saved with metadata.
Is this desired behaviour? Is there a way to retain attributes in variables when saving to nc?