google / TensorNetwork

A library for easy and efficient manipulation of tensor networks.
Apache License 2.0
1.8k stars 355 forks

Serialization and storage of Tensor Network objects #879

Open matthewweippert opened 3 years ago

matthewweippert commented 3 years ago

I'd love to see some methods for serialization and storage of matrix product operators and states. In the short term, I was hoping to use pickle and/or dill, but both are giving me problems:

import tensornetwork as tn
#import dill as pickle
import pickle
import numpy as np

m, d, N = 3, 5, 10
mps_orig = tn.FiniteMPS.random([d]*N, [m]*(N-1), dtype=np.complex128)

datafile = 'testfile.dill'
with open(datafile, 'wb') as df:
    pickle.dump(mps_orig, df)

with open(datafile, 'rb') as df:
    mps_read = pickle.load(df)

print(mps_orig.backend is mps_read.backend)
print(mps_orig.backend == mps_read.backend)

print(mps_orig.backend)
print(mps_read.backend)

I expected to see True and True and the same name for the numpy (default) backend twice. Python's pickle gets hung up on svd (maybe the first jitted function?): AttributeError: Can't pickle local object 'BaseMPS.__init__.<locals>.svd'

With dill (uncomment line 2 and comment line 3) I get

False
False
<tensornetwork.backends.numpy.numpy_backend.NumPyBackend object at 0x000002BD9FE12D00>
<tensornetwork.backends.numpy.numpy_backend.NumPyBackend object at 0x000002BDA1B272B0>

In the long run, I'd prefer more efficient and stable storage to disk using h5py. However, we may also want to use something like dask for distributed processing which uses pickle for communication.
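A note on the two False results in the dill output above: they are what Python produces for any object whose class does not define `__eq__`, because the default `==` falls back to identity and a pickle round trip always builds a new object. The sketch below uses a hypothetical stand-in class named `Backend` (not the library's backend class) and assumes the real backend likewise defines no `__eq__`, which is consistent with the output shown but not confirmed here.

```python
import pickle


class Backend:
    """Hypothetical stand-in for a backend object; it defines no __eq__."""

    def __init__(self, name):
        self.name = name


original = Backend("numpy")
# A pickle round trip constructs a brand-new instance.
restored = pickle.loads(pickle.dumps(original))

print(original is restored)            # False: two distinct objects
print(original == restored)            # False: default == compares identity
print(original.name == restored.name)  # True: the stored state survived
```

Under that assumption, comparing a stable attribute of the two backends would be a more meaningful round-trip check than `is` or `==`.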

matthewweippert commented 3 years ago

This may be related to #206 and #214.

mganahl commented 3 years ago

Hi @matthewweippert! The issue with pickle here is that the FiniteMPS class has a bunch of members which are inline-defined functions, and svd is one of them. The reason for this is indeed related to jitting. Let me see if I can fix this.
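To illustrate the failure mode in isolation, here is a minimal sketch in plain Python. The classes `LocalFnHolder` and `PicklableHolder` are hypothetical and are not part of TensorNetwork. The first stores a function defined inside `__init__`, which pickle cannot serialize because it looks functions up by qualified name. The second shows one common workaround: drop the local function in `__getstate__` and rebuild it in `__setstate__`. This is only an illustration of the general technique, not necessarily the fix the library would adopt.

```python
import pickle


class LocalFnHolder:
    """Hypothetical class that stores a function defined inside __init__."""

    def __init__(self, scale):
        self.scale = scale

        def svd(x):
            # Local function: pickle cannot locate it by qualified name.
            return x * scale

        self.svd = svd


class PicklableHolder(LocalFnHolder):
    """Same class, but the local function is excluded from the pickled state."""

    def __getstate__(self):
        state = self.__dict__.copy()
        del state["svd"]  # drop the member pickle cannot handle
        return state

    def __setstate__(self, state):
        # Re-run __init__ so the local function is rebuilt after loading.
        self.__init__(state["scale"])


try:
    pickle.dumps(LocalFnHolder(2))
    failed = False
except (AttributeError, pickle.PicklingError) as err:
    # The exception type differs between Python versions.
    failed = True
    print(type(err).__name__, err)

restored = pickle.loads(pickle.dumps(PicklableHolder(3)))
print(restored.svd(5))  # 15
```

The trade-off with this approach is that everything needed to rebuild the dropped members must itself be part of the pickled state.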