Qiskit / qiskit

Qiskit is an open-source SDK for working with quantum computers at the level of extended quantum circuits, operators, and primitives.
https://www.ibm.com/quantum/qiskit
Apache License 2.0
4.85k stars 2.29k forks source link

Reorganize circuit elements class hierarchy #5811

Open 1ucian0 opened 3 years ago

1ucian0 commented 3 years ago

The things that can be added in a circuit (let's call them CircuitElement for now) are growing a lot and some reorganization is needed. This discussion issues is to talk about that reoganization. I'll try to keep the first comment updated as the discussion progress. Also, if you have the privileges, you can edit this directly.

CircuitElement

Root abstract class.

OpaqueInstruction(CircuitElement)

Not reversible.

Reset(OpaqueInstruction)

Instruction(OpaqueInstruction)

Circuit(CircuitElement)

Implements current circuit. Not necessarily reversible.

Gate(Circuit)

Implements current gate. Reversible.

ControlledGate(Gate)

Directive(CircuitElement)

Barrier(Directive)

kdk commented 3 years ago

Thanks @1ucian0 for starting the discussion on this. This has been a long while coming, and what's above is a good description of the structure where things currently stand.

A good place to start the discussion is collecting the use cases and behaviors we want to support, and designing roles/mixins around those (similar to what's been done recently with Operators). Then we can focus our design on how Terra expects to consume these objects without being too prescriptive on how they are implemented or extended by users.

For basic instruction on the circuit:

For circuit building:

Serialization:

Transpilation:

And likely there are more that I missed.

Along with this redesign would need to come a change in the transpiler to by default, leave in place, operations it does not know how to operate over.

Likewise, it'd be good to collect a set of use cases that we know are challenging with our current structure, to see that they're better supported in the future.

The two that immediately come to mind are:

ajavadia commented 3 years ago

some earlier related issues: https://github.com/Qiskit/qiskit-terra/issues/4700, https://github.com/Qiskit/qiskit-terra/issues/3800 also i made a comment about class names here:https://github.com/Qiskit/qiskit-terra/issues/5774

kdk commented 2 years ago

After some renewed interest from the ongoing work in https://github.com/Qiskit/qiskit-terra/labels/synthesis , it looks like there are two issues at play here:

These can be addressed independently, and so the below focuses only on the latter.

Currently, higher-level operations can be appended to circuits in one of two ways:

From the perspective of #4700, the former has limitations in that it:

The latter does support #4700, but with constraints that:

Ameliorating the above (through design, standardization and documentation) is a viable option for supporting #4700 , and should be considered.

An alternate approach has been discussed around using mixins as a means to standardize the interface for the operations that can be placed in a quantum circuit. Mixins in python (also called traits or roles) are a limited form of multiple inheritance which define a limited interface of behaviors (not overlapping with other mixins) which any class can implement and satisfy. (More detail on the rationale behind mixin usage here: https://www.python.org/dev/peps/pep-3133/#rationale )

As compared to the Instruction sub-class approach above, mixins:

Mixins are not a built-in python language feature, and so require some care to ensure the pattern is followed (e.g. ensure no mixin defines an __init__, and none have overlapping property or method names). Its possible this can be addressed with lint rules, or may not arise in practice without a large number of mixins. quantum_info.operator has used this pattern since #5617 and may have some relevant experience.

Some rough pseudocode of how a mixin pattern might look:

from abc import ABC, abstractmethod

class CircuitOperationMixin(ABC):
    @property
    @abstractmethod
    def name(self):
        """Unique string identifier for operation type. Used by the circuit and
        others to collect common operations which differ only in their operands
        (e.g. for resource estimation, serialization, and visualization).
        """
        raise NotImplementedError

    @property
    @abstractmethod
    def num_qubits(self):
        """The number of quantum bit operands required by the operation (in the
        circuit model, this encompasses both input and output). Used by the circuit
        in tracking and ordering quantum resources e.g. for calculating of depth.
        """
        raise NotImplementedError

    @property
    @abstractmethod
    def num_clbits(self):
        """The number of classical bit operands required by the operation (in the
        circuit model, this encompasses both input and output). Used by the circuit
        in tracking and ordering classical resources e.g. for calculating of depth.
        """
        raise NotImplementedError

    @property
    @abstractmethod
    def num_params(self):
        """The number of python operands required by the operation. The number
       and type of these operands will in general be dependent on the operation
       type. These operands must be resolved prior to device execution. Used by
       the circuit in binding and assignment of parameters.
        """
        raise NotImplementedError

    @property
    @abstractmethod
    def params(self):
        """Returns a python tuple of the python operands currently assigned to
        this operation.
        """
        raise NotImplementedError

    @params.setter
    @abstractmethod
    def params(self):
        """Accepts a python tuple of python operands to be assigned to this operation.
        Performs and necessary validation and updates to internal data structures.
        """
        raise NotImplementedError

...

class Clifford(BaseOperator, AdjointMixin, CircuitOperationMixin):
    ...

    @property
    def name(self):
        return 'clifford'

    # implements @property num_qubits through BaseOperator

    @property
    def num_params(self):
        return 1

    @property
    def num_clbits(self):
        return 0

    @property
    def params(self):
        return (self._table,)

class Instruction(CircuitOperationMixin):  # Otherwise unchanged
    ...

Classes which implement the CircuitOperationMixin interface are valid to add to a QuantumCircuit, and so QuantumCircuit.append would be updated to instead check isinstance(instruction, CircuitOperationMixin) and we could deprecate and remove:

        if not isinstance(instruction, Instruction) and hasattr(instruction, "to_instruction"):
            instruction = instruction.to_instruction()
jakelishman commented 2 years ago

Mixins generally imply multiple inheritance, and multiple inheritance doesn't play nicely with __slots__ in the general case. We construct a lot of circuit element objects, and so for memory and performance reasons, it's probably going to be better for us to have slots on them all; they'll be faster to initialise, take less memory, and operations on them are faster. I actually was interested in removing the mixins from quantum_info, in the interests of speeding up various transpiler passes that need to construct many Operator instances, since the initialisation is slow and memory-hungry in the current form.

I'd want to avoid having Instruction and Operator having any ABC instantiation checks in the classes if at all possible as well. Perhaps we could manage these properties through a metaclass, which takes a bunch of API definition classes as arguments, then defines any necessary __slots__, and checks that all necessary methods are defined at class-definition time, rather than at object instantiation time?

I'd imagine we'd end up with initialisers that looked like

class XGate(metaclass=TraitManager([CircuitOperation, SelfInverse, SelfAdjoint])):
    ...

class Clifford(OperatorBase, metaclass=TraitManager([CircuitOperation, Adjoint, Inverse])):
    ...

or so. It's not immediately clear to me when we'd use regular inheritance (Clifford(OperatorBase) v XGate()) in conjunction with any managed trait system, but the metaclass machinery can access the bases anyway, so that isn't a limitation. The TraitManager metaclass can enforce that all abstract methods are defined at class initialisation time, so we can avoid all of abc.ABC's costly intialisation checks, but still get the benefits with regards to safety.

I have the rough sketch of how to do this in my head, but I don't want to commit it to code here in the issue, because I imagine there's a bunch of rough edges (subclass instantation, error handling, ergonomics of implementing the "does this object have this trait?" checks) that I'd need to think about more to do properly. I can imagine nice ways to handle all of them, but it'd need some testing to be sure.

kdk commented 2 years ago

For the caveats with using __slots__ and multiple inheritance, I came across https://docs.python.org/3/reference/datamodel.html#notes-on-using-slots , from which:

  • The action of a slots declaration is not limited to the class where it is defined. slots declared in parents are available in child classes. However, child subclasses will get a dict and weakref unless they also define slots (which should only contain names of any additional slots).
  • If a class defines a slot also defined in a base class, the instance variable defined by the base class slot is inaccessible (except by retrieving its descriptor directly from the base class). This renders the meaning of the program undefined. In the future, a check may be added to prevent this. ...
  • Multiple inheritance with multiple slotted parent classes can be used, but only one parent is allowed to have attributes created by slots (the other bases must have empty slot layouts) - violations raise TypeError.

seem relevant (though I'm not exactly sure how to parse the conditions and consequences of the second bullet). The last does seem fairly incompatible with use of __slots__ within both the object classes and mixins. (As a potential near term workaround, if the child class explicitly defines all the __slots__ of its parents, would this sidestep this problem?) The TraitManager metaclass approach seems a reasonable solution to me (though agree one that would require some thinking and testing to avoid rough edges). Would we need something like a TraitManger in-place before starting a potential move towards mixins? If not, maybe a quick POC would be worthwhile to potentially uncover any incompatibilities that might deter us from taking the mixin route.

I'd want to avoid having Instruction and Operator having any ABC instantiation checks in the classes if at all possible as well.

Can you clarify what's meant by ABC instantiation checks here? e.g. If I define a new class that's a subclass of abstract class which itself subclasses ABC, is there an added cost on instantiating new instances of the child class?

I tried the following:

from abc import ABC, abstractmethod
class B():
    def foo(self):
        return 3

class Parent_abc(ABC):
    @abstractmethod
    def foo(self):
        return 3

class Babc(Parent_abc):
    def foo(self):
        return 3

and got the following (python 3.6):

%%timeit
b = B()
101 ns ± 10.8 ns per loop (mean ± std. dev. of 7 runs, 10000000 loops each)

%%timeit
b = Babc()
103 ns ± 2.84 ns per loop (mean ± std. dev. of 7 runs, 10000000 loops each)

which seem comparable. Creating classes which subclass ABC does seem slower though:

%%timeit
class Cabc(Parent_abc):
    def foo(self):
        return 3
23.4 µs ± 2.16 µs per loop (mean ± std. dev. of 7 runs, 10000 loops each)

%%timeit
class Cabc():
    def foo(self):
        return 3
9.89 µs ± 572 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)
jakelishman commented 2 years ago

Slots

Yeah, the last bullet point in that list from the Python docs was what I was alluding to with mixins being a nuisance with slots. I'm pretty sure it's possible to write a metaclass to construct a __slots__ class with multiple inheritance, by scanning over the derived types and building up the __slots__ field manually, before it's passed to type to construct. Class decorators can't do that in the same way, because they require that the class is first constructed, then the bound name is modified, so you end up "intercepting" the creation at the wrong point.

You can define slots that are already defined by a parent, currently, and nothing will complain, but it's wasteful of space. It shouldn't have speed implications, except for indirect ones caused by less efficient cache use. The reason is related to that second bullet point you weren't sure about (hidden in the expander).

Derailing technical details about __slots__... Take, for example: ```python In [35]: class Parent: ...: __slots__ = ('a',) ...: class Override(Parent): ...: __slots__ = ('a',) ...: class Inherit(Parent): ...: __slots__ = () ...: In [36]: import sys ...: sys.getsizeof(Override()), sys.getsizeof(Inherit()) Out[36]: (48, 40) ``` I believe this is because at the C level, space for a `PyObject *` corresponding to each item defined in `__slots__` is reserved on instantiation, and then the object `Parent.a` is actually a descriptor (like how `@property` works), which fetches the value for an instance from a fixed, known pointer offset in memory. In contrast, `__dict__` lookups have an extra step, where they first need to look up the offset of a given name in a hashmap, _then_ they check if the resulting object is a descriptor, and then they return the value. This means that in the case here, `Override.a` is actually a `@property`-like getter that's distinct from `Parent.a`, whereas `Inherit.a` is referentially _exactly_ `Parent.a`: ```python In [5]: Override.a Out[5]: In [6]: Inherit.a Out[6]: ``` So now there's more space than expected on `Override` objects. I can't do `Override.nonexistent = None` without getting an AttributeError, but I _can_ do ```python In [7]: overridden = Override() ...: overridden.a = 'child' ...: Parent.a.__set__(overridden, 'parent') ...: overridden.a Out[7]: 'child' In [8]: Parent.a.__get__(overridden) Out[8]: 'parent' ``` where I've managed to store an extra Python object in `overridden`, that just by looking at its `__slots__`, I shouldn't have been able to. If I try the first cell with `Inherit`, you can see that the extra space isn't there: ```python In [9]: inherited = Inherit() ...: inherited.a = 'child' ...: Parent.a.__set__(inherited, 'parent') ...: inherited.a Out[9]: 'parent' ``` This is what's meant by "the instance variable defined by the base class slot is inaccessible (except by retrieving its descriptor directly from the base class)" - `Parent.a` is the base class descriptor, and here I've used it to access a previously hidden value in `overridden`.

ABC

My timing concerns on ABC are mistaken - I hadn't actually checked, and I had a false model in my head for how ABCMeta goes about its business. I have no more concerns about using ABC, but we do know that using @property has non-trivial speed impacts on fetching attributes. In cases where we just want to define that an attribute exists (a member, rather than a method), we can use __slots__ instead, if we have something like a trait-manager metaclass that correctly builds the type from multiple inherited slots classes.

"TraitManager"

This actually feels to me like something that should be an external library, were we to have it. It may even already exist - I'll have a better look tomorrow, and if not, I'll have a think about writing one. Our use-case is more about static type creation, rather than dynamic trait look-up, though, so it's quite un-Pythonic (but then again, core speed/memory optimisations often are).

eliarbel commented 2 years ago

Perhaps I'm missing here something but it seems to me that slots and mixin are orthogonal, at least in our discussion here. The use if mixin here is to signify a role (CircuitElement) for certain (albeit many) classes, via an interface commitment. I don't see how slots, being an optimization for handling class attributes more effectively, help us with having interface and functionality commitments for circuit elements. Am I missing here something ?

If run-time efficiency is the main concern for not using mixin and preferring slots and metaclass machinery, I suggest that maybe we do sufficient benchmarking for we can wisely trade off performance with other important factors like readability and extendability.

jakelishman commented 2 years ago

My concern was all about not letting mixins prevent us from using __slots__ and suppressing the creation of __dict__ in downstream classes. I wanted to have both interface definitions and __slots__, but I thought that defining attribute interfaces as property and abstractmethod would require those to be property in child classes (to satisfy abc.ABC), while actually that's not the case. Coming back to it again today, there is no reason you can't use mixins with property and abstractmethod to define an interface, and have each child class just manually override all the slots:

In [9]: import abc
   ...:
   ...:
   ...: class Mixin(abc.ABC):
   ...:     __slots__ = ()
   ...:
   ...:     @property
   ...:     @abc.abstractmethod
   ...:     def a(self):
   ...:         pass
   ...:
   ...:
   ...: class Child(Mixin):
   ...:     __slots__ = ('a',)
   ...:
   ...:     def __init__(self):
   ...:         self.a = 3
   ...:

In [10]: Child().a
Out[10]: 3

In [11]: Child().__dict__
---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
<ipython-input-12-9c959507972c> in <module>
----> 1 Child().__dict__

AttributeError: 'Child' object has no attribute '__dict__'

There's a minor downside in that the derived classes have to write all their own __slots__ definitions for attributes in the interfaces (like my Child had to have __slots__ = ('a',)), but that's not really the end of the world, even if there is a fair amount of duplication. We just have to ensure that all our mixins correctly set __slots__ = ().

There are already timing nuisances in some transpiler passes because some of the quantum_info classes (Operator in particular) are heavier than they need to be, slower to initialise, and we instantiate millions+ of them. The speed is in part because of how they use mixins with attributes defined in them, without the mixins having __slots__ = (), and Operator doesn't define its slots. That's why I really wanted to make sure we wouldn't be introducing new problems with mixins here - I've absolutely nothing against defining interfaces (that's absolutely the right thing to do), I just wanted to make sure that it wouldn't interfere with speed/memory concerns.

ajavadia commented 2 years ago

I very much endorse these requirements from @kdk:

  • don't require higher-level operations to build separate Instruction sub-classes or to adhere to the Instruction class hierarchy in order to be added to a circuit (Operators, or Multipliers can have their own class structure, or not, independent of that of Instruction),
    • facilitate inspection of the high-level operations by the transpiler (as the object itself is contained on the circuit),
    • allow for the standardization and documentation of the interface of what's required for a given operation to be stored on a circuit (some of this is currently encoded in the Instruction base class, some is tribal knowledge e.g. how to have a variadic gate type).

But I don't have strong opinions on the use of mixins vs. trait manager.

jakelishman commented 2 years ago

I think forget about this idea of a trait manager now - it's not necessary, and we can do things efficiently without it. I also like the interfaces method Kevin's describing there.

1ucian0 commented 2 years ago

Is https://github.com/Qiskit/qiskit-terra/issues/4313 superseded by this one?