dwavesystems / dimod

A shared API for QUBO/Ising samplers.
https://docs.ocean.dwavesys.com/en/stable/docs_dimod/
Apache License 2.0
124 stars 81 forks source link

ASDF serialization #513

Open randomir opened 5 years ago

randomir commented 5 years ago

ASDF - Advanced Scientific Data Format looks promising for serialization in Ocean (BQM/SampleSet in dimod's case).

It has the following features:

Libraries for python and c++ are available.

randomir commented 5 years ago

The point here being -- instead of (re)inventing custom serialization schemas, we should use something like this that works with NumPy out of the box, it's fast, supports other custom datatypes through extensions and has libraries for languages we use.

One downside is it has libraries for only Python and C++. But at least the standard is backed by a rather large org.

arcondello commented 5 years ago

This is pretty cool!

Would the idea be to make asdf a dependency of dimod, e.g.

BQM.to_asdf()

or to just dump to an asdf compatible format e.g.

af = asdf.AsdfFile(BQM.to_serializable())
randomir commented 5 years ago

No, actually the idea was for dimod objects to expose to_dict method which would return a "tree" (in Asdf terminology), i.e. a dict with NumPy objects in it. Dual method, from_dict, would accept the same.

Serialization would be handled on a lower level, closer "to wire". And Asdf would be used there only.

(I've been advocating this approach for serialization since we first talked about it, and I guess to_serializable comes quite close. Notable distinction is: it still goes through additional effort of serializing ndarray to list/bytes. With Asdf, dimod doesn't have to get its hands dirty with serialization of "standard" data types; where I consider numpy.ndarray becoming increasingly "standard" in modern Python).