State abstraction - GitHub issues

scarrazza commented 3 years ago

Following our discussion, one of the asymmetries in the code is the state representation. We should provide an abstraction layer for the state and math/algebra operations.

stavros11 commented 3 years ago

Regarding the backend abstraction, we could follow what they do in google/TensorNetwork. I may be biased because I contributed to it in the past; however, I believe their backend scheme is very clean. In our case we could have the following:

from abc import ABC, abstractmethod

class AbstractBackend(ABC):

    def __init__(self):
        self.backend = None
        self.name = "abstract"

    @abstractmethod
    def matmul(self, x, y):
        """Matrix multiplication of two rank-2 tensors."""
        raise_error(NotImplementedError)

    @abstractmethod
    def sum(self, x, axis=0):
        """Sum of tensor elements along the given axis."""
        raise_error(NotImplementedError)

    @abstractmethod
    def einsum(self, *args):
        """Einsum of arbitrary rank tensors."""
        raise_error(NotImplementedError)

    @abstractmethod
    def eigh(self, x):
        """Exact diagonalization of matrices."""
        raise_error(NotImplementedError)

    ...

class NumpyBackend(AbstractBackend):

    def __init__(self):
        import numpy as np
        self.backend = np
        self.name = "numpy"

    def matmul(self, x, y):
        return self.backend.matmul(x, y)

    def sum(self, x, axis=0):
        return self.backend.sum(x, axis=axis)

    def einsum(self, *args):
        return self.backend.einsum(*args)

    def eigh(self, x):
        return self.backend.linalg.eigh(x)

    ...

class TensorflowBackend(NumpyBackend):

    def __init__(self):
        import tensorflow as tf
        self.backend = tf
        self.name = "tensorflow"

    def sum(self, x, axis=0):
        return self.backend.reduce_sum(x, axis=axis)

    ...

qibo_numpy = NumpyBackend()
qibo_tensorflow = TensorflowBackend()

and then use the backends in other modules as

from backends import qibo_numpy as K  # backends: the module holding the classes above
z = K.sum(K.matmul(x, y))

This approach may seem a bit redundant because it redefines many methods (it would seem simpler to just do import tensorflow as K); however, it has several advantages:

It solves the issue of methods that do not exist in all backends or that have different names in each backend. As the example shows, np.sum corresponds to tf.reduce_sum.
Currently we do import backend as K, import tensorflow as tf, ... across several different modules, which makes it hard to track which backend methods we actually use. With the above approach, AbstractBackend would hold a complete and well-documented list of all backend methods used in Qibo, making it easier to define new backends in the future.
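
The pattern above can be tried without NumPy or TensorFlow installed. The following is a minimal sketch, not Qibo code: BuiltinBackend and FsumBackend are hypothetical stand-ins, and the built-in sum versus math.fsum plays the role of np.sum versus tf.reduce_sum (same operation, different name in the underlying library).

```python
import math
from abc import ABC, abstractmethod


class AbstractBackend(ABC):
    """Single, documented list of the operations other modules may call."""

    name = "abstract"

    @abstractmethod
    def sum(self, x):
        """Sum of the elements of an iterable of numbers."""

    @abstractmethod
    def dot(self, x, y):
        """Inner product of two equal-length sequences of numbers."""


class BuiltinBackend(AbstractBackend):
    """Hypothetical backend built on Python built-ins."""

    name = "builtin"

    def sum(self, x):
        return sum(x)

    def dot(self, x, y):
        # Written in terms of self.sum so subclasses only override what differs.
        return self.sum(a * b for a, b in zip(x, y))


class FsumBackend(BuiltinBackend):
    """Hypothetical backend whose summation routine has a different name."""

    name = "fsum"

    def sum(self, x):
        return math.fsum(x)


def norm_squared(K, x):
    """Backend-agnostic code: only calls methods declared in AbstractBackend."""
    return K.dot(x, x)


for K in (BuiltinBackend(), FsumBackend()):
    print(K.name, norm_squared(K, [3.0, 4.0]))
```

Code written against K never names the underlying library, and because the base class derives from ABC, a backend that forgets to implement a declared method fails at instantiation instead of at the first missing call.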

Regarding the state abstraction, we currently have the DistributedState object that is returned from distributed (multi-GPU) circuits. This was created to avoid memory issues that arise when merging the state pieces at the end of a distributed simulation. So we have the following asymmetry:

Circuit execution returns a tf.Tensor,
DistributedCircuit execution returns a DistributedState object,
any circuit (distributed or not) with measurements returns a custom object that holds the measurement bitstrings.

which will probably become worse if we add a different backend.

A solution could be to force all circuit executions to return a custom State object. It would hold the final tensor (whose type is backend dependent) and, if the circuit contained measurement gates, the measurement results. Asking for measurements from a circuit without measurement gates would raise an error or return None.
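
As a rough illustration of such an object, here is a minimal sketch. The class layout, the method names (samples, frequencies) and the choice of RuntimeError are assumptions made for this example, not an existing Qibo API; a plain list stands in for the backend-dependent tensor.

```python
from collections import Counter


class State:
    """Hypothetical uniform return type for every circuit execution."""

    def __init__(self, tensor, measurements=None):
        self._tensor = tensor              # backend-dependent final state
        self._measurements = measurements  # None if no measurement gates

    @property
    def tensor(self):
        """The final state, whatever type the active backend produced."""
        return self._tensor

    @property
    def has_measurements(self):
        return self._measurements is not None

    def samples(self):
        """Measured bitstrings; raises if the circuit measured nothing."""
        if self._measurements is None:
            raise RuntimeError("The circuit contained no measurement gates.")
        return list(self._measurements)

    def frequencies(self):
        """Number of times each bitstring was measured."""
        return Counter(self.samples())


plain = State([1.0, 0.0, 0.0, 0.0])
measured = State([0.5, 0.5, 0.5, 0.5], measurements=["00", "11", "00"])
print(plain.has_measurements, measured.frequencies()["00"])
```

With this shape, callers always receive the same type and can check has_measurements instead of inspecting what kind of object a given circuit returned.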

@scarrazza let me know what you think about these. If you agree, I can open some PRs (preferably separate for the two issues) with a more concrete proposal.

scarrazza commented 3 years ago

The idea looks good. Does this include some mechanism to switch backend on the fly?

stavros11 commented 3 years ago

> The idea looks good. Does this include some mechanism to switch backend on the fly?

In principle we could have something similar to our current precision, device and "backend" (custom/einsum) setters, however I did a few quick tests and it seems slightly more complicated. A potential solution would be to do the following in config.py:

import backends # module containing the classes from the previous post

try:
    import tensorflow as tf
    BACKEND = {"module": backends.TensorflowBackend()}
except ModuleNotFoundError:
    # set default backend to numpy if Tensorflow is not installed
    BACKEND = {"module": backends.NumpyBackend()}

def set_backend(backend="tensorflow"):
    if backend == "tensorflow":
        BACKEND["module"] = backends.TensorflowBackend()
    elif backend == "numpy":
        BACKEND["module"] = backends.NumpyBackend()
    else:
        raise_error(ValueError, "Unknown backend {}.".format(backend))

Then for example in hamiltonians.py or any other file that uses the backend one would have to do:

from config import BACKEND

class Hamiltonian:

    def __init__(self, ...):
        self.K = BACKEND.get("module")

Note that K should be set during initialization (inside __init__); otherwise the switcher will be ignored. For example, if we do

from config import BACKEND
K = BACKEND.get("module")

or even

from config import BACKEND

class Hamiltonian:
    K = BACKEND.get("module")

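then it won't work properly: K is bound once, when the module is imported or the class is defined, so later calls to set_backend do not affect it.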

Another possibility is to just use BACKEND.get("module") instead of K for every call (e.g. BACKEND.get("module").matmul), but this is not good for readability.
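
The effect of binding time can be checked with a self-contained sketch. NumpyLike and TensorflowLike are hypothetical stand-ins for the backend classes so that nothing needs to be installed. The PerCall class is an additional variant not proposed in this thread: a property that performs the BACKEND lookup on every access while keeping the short self.K spelling.

```python
class NumpyLike:
    """Hypothetical stand-in for backends.NumpyBackend."""
    name = "numpy"


class TensorflowLike:
    """Hypothetical stand-in for backends.TensorflowBackend."""
    name = "tensorflow"


BACKEND = {"module": TensorflowLike()}


def set_backend(backend="tensorflow"):
    if backend == "tensorflow":
        BACKEND["module"] = TensorflowLike()
    elif backend == "numpy":
        BACKEND["module"] = NumpyLike()
    else:
        raise ValueError("Unknown backend {}.".format(backend))


K_MODULE = BACKEND.get("module")  # bound once, when this file is executed


class ClassLevel:
    K = BACKEND.get("module")     # bound once, when the class is defined


class InitLevel:
    def __init__(self):
        self.K = BACKEND.get("module")  # bound each time an object is created


class PerCall:
    @property
    def K(self):
        return BACKEND.get("module")    # looked up on every access


early = InitLevel()
live = PerCall()
set_backend("numpy")
late = InitLevel()

print(K_MODULE.name, ClassLevel.K.name, early.K.name, late.K.name, live.K.name)
# prints: tensorflow tensorflow tensorflow numpy numpy
```

Note that the object created before the switch (early) also keeps the old backend, so setting K in __init__ only helps for objects constructed after set_backend is called.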

scarrazza commented 3 years ago

Ok, but then we need some way to change backend after initialization. For example, we could have an __init__.py:

from sys import modules

def select_backend(backend_name='a'):
    mo = modules[__name__]
    if backend_name == 'a':
        from stavros.backendA.interface import set_backend
    else:
        from stavros.backendB.interface import set_backend
    set_backend(mo)

select_backend()

Followed by specific folders per backend, say in this example backendA/interface.py:


def do():
    print('Interface A')

def set_backend(module):
    setattr(module, "do", do)

and backendB/interface.py:


def do():
    print('Interface B')

def set_backend(module):
    setattr(module, "do", do)

This should provide the required functionality.
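
The same mechanism can be exercised in a single file. This is a sketch under stated assumptions: types.ModuleType("package") stands in for modules[__name__], make_interface and INTERFACES are hypothetical helpers replacing the two backendA/backendB folders, and do returns its label instead of printing it so the result can be inspected.

```python
import types


def make_interface(label):
    """Build a hypothetical interface namespace (one per backend folder)."""

    def do():
        return "Interface " + label

    def set_backend(module):
        # Overwrite the package-level name with this backend's function.
        setattr(module, "do", do)

    return types.SimpleNamespace(do=do, set_backend=set_backend)


INTERFACES = {"a": make_interface("A"), "b": make_interface("B")}

# Stand-in for the package object that __init__.py would patch.
package = types.ModuleType("package")


def select_backend(backend_name="a"):
    INTERFACES[backend_name].set_backend(package)


select_backend()
first = package.do()   # attribute lookup through the module
stale = package.do     # what "from package import do" would capture

select_backend("b")
second = package.do()

print(first, "|", second, "|", stale())
# prints: Interface A | Interface B | Interface A
```

Calls that go through the module attribute follow the switch, while a name copied out of the module beforehand keeps pointing at the old function, so callers would need to access the interface through the package.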

stavros11 commented 3 years ago

@scarrazza, thanks for this comment and the idea. I am not sure if I properly understood this but I did a small implementation based on this in #303. Please have a look and let me know if this agrees with what you have in mind.

qiboteam / qibo

State abstraction #300