COSIMA / cosima-cookbook

Framework for indexing and querying ocean-sea ice model output.
https://cosima-recipes.readthedocs.io/en/latest/
Apache License 2.0

Add COSIMA metadata to attributes of data returned from getvar #266

Closed aidanheerdegen closed 3 years ago

aidanheerdegen commented 3 years ago

The cookbook has access to metadata with important provenance information. It makes sense to add this metadata to the xarray.DataArray returned by getvar.

We could add a flag to toggle this behaviour (defaulting to adding the metadata), but I'm leaning away from that. It adds complexity to the API, and it isn't hard to ignore or delete attributes that aren't required. Including the metadata can only help researchers, even if they only realise it a year later, when they have some data, can't recall the details of where it came from, and find that the information is available in the global attributes.
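As a rough sketch of the idea (not the cookbook's actual implementation), attaching provenance to an xarray.DataArray could look something like the following. The `attach_metadata` helper, the `cosima_` prefix and the metadata keys are all hypothetical, chosen only for illustration; the fields the cookbook database really holds may differ.

```python
import numpy as np
import xarray as xr

# Hypothetical provenance record; the real cookbook metadata fields may differ.
EXAMPLE_METADATA = {
    "experiment": "example_experiment",
    "contact": "A. Researcher",
    "description": "Illustrative experiment description",
}


def attach_metadata(da, metadata, prefix="cosima_"):
    """Return a copy of `da` with `metadata` merged into its attributes.

    Keys are prefixed so they are easy to spot (and strip) later, and
    attributes already on the array take precedence on a name clash.
    """
    new_attrs = {f"{prefix}{key}": value for key, value in metadata.items()}
    new_attrs.update(da.attrs)  # existing attributes win on clashes
    return da.assign_attrs(new_attrs)


# Stand-in for an array as it might come back from getvar.
da = xr.DataArray(
    np.zeros((2, 3)), dims=("y", "x"), name="temp", attrs={"units": "degC"}
)
tagged = attach_metadata(da, EXAMPLE_METADATA)
print(tagged.attrs)

# Anyone who doesn't want the extra attributes can drop them in one line.
stripped = tagged.copy()
stripped.attrs = {
    k: v for k, v in tagged.attrs.items() if not k.startswith("cosima_")
}
```

The prefix is just one way to keep the added attributes easy to ignore or delete, which is the argument above for not needing a toggle flag.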

aidanheerdegen commented 3 years ago

Any issues with this @AndyHoggANU, @aekiss or @navidcy?

I think this, in addition to #262, will help a lot with provenance and allow end users to better understand where their data came from and how it was constructed.

AndyHoggANU commented 3 years ago

I don't think I would have any issues -- as long as it doesn't slow down variable retrieval? Provenance is always good.

aidanheerdegen commented 3 years ago

Won't make any difference to variable retrieval but I'll check.
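One quick way to check, as a sketch only: time the attribute-attaching step on its own. This does not call getvar or touch the cookbook database; the array size and the 20 hypothetical metadata entries are assumptions made for illustration.

```python
import timeit

import numpy as np
import xarray as xr

# Stand-in array and a hypothetical metadata dictionary of 20 entries.
da = xr.DataArray(np.zeros((100, 100)), dims=("y", "x"))
metadata = {f"key_{i}": f"value_{i}" for i in range(20)}

# Time only the step that merges the metadata into the attributes.
n_calls = 1000
seconds = timeit.timeit(lambda: da.assign_attrs(metadata), number=n_calls)
per_call_us = seconds / n_calls * 1e6
print(f"assign_attrs: ~{per_call_us:.1f} microseconds per call")
```

This measures only the cost of merging a dictionary into the attributes; any cost of looking the metadata up in the first place would need to be timed separately.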

navidcy commented 3 years ago

> The cookbook has access to metadata with important provenance information. It makes sense to add this metadata to the xarray.DataArray returned by getvar.

This makes absolute sense, of course. I don't understand why it's something worth debating...

aidanheerdegen commented 3 years ago

You never know what people might object to. Also, knowing whether something is considered worthwhile helps with prioritising work.

navidcy commented 3 years ago

The only concern would be performance, as @AndyHoggANU pointed out.

navidcy commented 3 years ago

But even a slight performance loss is still worth it... Loading variables never takes that long anyways... (imo)