Open randomir opened 5 years ago
The point here being -- instead of (re)inventing custom serialization schemas, we should use something like this that works with NumPy out of the box, it's fast, supports other custom datatypes through extensions and has libraries for languages we use.
One downside is it has libraries for only Python and C++. But at least the standard is backed by a rather large org.
This is pretty cool!
Would the idea be to make asdf a dependency of dimod, e.g.
BQM.to_asdf()
or to just dump to an asdf compatible format e.g.
af = asdf.AsdfFile(BQM.to_serializable())
No, actually the idea was for dimod objects to expose to_dict
method which would return a "tree" (in Asdf terminology), i.e. a dict with NumPy objects in it. Dual method, from_dict
, would accept the same.
Serialization would be handled on a lower level, closer "to wire". And Asdf would be used there only.
(I've been advocating this approach for serialization since we first talked about it, and I guess to_serializable
comes quite close. Notable distinction is: it still goes through additional effort of serializing ndarray
to list
/bytes
. With Asdf, dimod doesn't have to get its hands dirty with serialization of "standard" data types; where I consider numpy.ndarray
becoming increasingly "standard" in modern Python).
ASDF - Advanced Scientific Data Format looks promising for serialization in Ocean (BQM/SampleSet in dimod's case).
It has the following features:
Libraries for python and c++ are available.