arviz-devs / InferenceObjects.jl

Storage for results of Bayesian inference
https://julia.arviz.org/InferenceObjects
MIT License

Changing representation of attributes #9

Closed by sethaxen 2 years ago

sethaxen commented 2 years ago

Currently we implement attributes as an OrderedDict{Symbol,Any}. This has two problems. First, when serialized to Zarr or NetCDF or when sent to Python, the symbols are converted to strings and must be converted back when building a new InferenceData. Second, this aspect of the schema is not followed by Python ArviZ and will likely be removed due to lack of necessity (see https://github.com/arviz-devs/arviz/issues/2064#issuecomment-1224952818).
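
To illustrate the first problem, here is a rough sketch using only Base Julia. A plain Dict stands in for the OrderedDict, the "stored" dictionary only imitates what a writer does to the keys (no Zarr or NetCDF call is made), and the attribute names and values are made up:

```julia
# Current representation: Symbol keys (plain Dict used in place of OrderedDict).
attrs = Dict{Symbol,Any}(:chains => 4, :draws => 1000)

# Stand-in for serialization: the keys end up as strings in the stored form.
stored = Dict{String,Any}(String(k) => v for (k, v) in attrs)

# Reading back gives String keys, so the Symbol keys must be rebuilt by hand
# before the attributes match the original representation again.
restored = Dict{Symbol,Any}(Symbol(k) => v for (k, v) in stored)
```

The extra conversion on the last line is the step that every reader currently has to perform.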

So the new proposal is to encode attributes as a Dict{String,Any}. Alternatively, we could use a Dict{Any,Any} and call string on the keys before serialization if needed.
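
A minimal sketch of both options, using only Base Julia. The helper name stringify_keys is hypothetical and not part of InferenceObjects.jl, and the attribute names and values are made up:

```julia
# Hypothetical helper (not part of InferenceObjects.jl): normalize any
# attribute dictionary to String keys, e.g. right before serialization.
stringify_keys(attrs::AbstractDict) =
    Dict{String,Any}(string(k) => v for (k, v) in attrs)

# Option 1: store attributes with String keys from the start.
attrs_string = Dict{String,Any}("inference_library" => "Turing", "chains" => 4)

# Option 2: accept any key type and convert only when serializing.
attrs_any = Dict{Any,Any}(:inference_library => "Turing", "chains" => 4)
attrs_serializable = stringify_keys(attrs_any)
```

With option 1 no conversion is needed in either direction; with option 2 the conversion happens once on write, but keys read back from storage will still be strings.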

sethaxen commented 2 years ago

NCDatasets constrains attribute keys to be Union{AbstractString,Symbol} and the values to be scalars or vectors. When writing, the symbol is converted to a string.

Zarr allows any type to be a key, but it is converted to a String upon writing. The same is true for the values, where Zarr recursively converts all values to dicts whose keys are the fields of the original type.

Given these constraints, the simplest thing for us to do right now is to make the metadata a Dict{String,Any}.