april-tools / cirkit

a python framework to build, learn and reason about probabilistic circuits and tensor networks
https://cirkit-docs.readthedocs.io/en/latest/
GNU General Public License v3.0

Lift `@final` for RG and SymbC/TensC #213

Closed lkct closed 1 month ago

lkct commented 8 months ago

Currently, @final is applied to the major classes in the pipeline (RG, SymbC, TensC), which prevents them from being subclassed. However, it might be helpful to allow subclassing, because users may want to customize their behaviour or extend their functionality.
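
For context, a minimal sketch of what the annotation does, assuming the decorator in question is `typing.final` (or the `typing_extensions` equivalent). The class names below are hypothetical and not taken from cirkit. The restriction is reported by static type checkers; the interpreter itself does not reject the subclass.

```python
from typing import final


@final
class PipelineStage:
    """Hypothetical stand-in for a class marked @final (e.g. RG or SymbC)."""

    def __init__(self) -> None:
        self.ready = True


# A static checker such as mypy flags this subclass as an error;
# at runtime Python still creates the class without complaint.
class CustomStage(PipelineStage):  # type: ignore[misc]
    pass


stage = CustomStage()
```

So lifting `@final` mainly changes what type checkers accept and what the project signals as supported usage.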

The major motivation for using @final in the current implementation is to guarantee that __init__ always follows the predefined logic, but this can be too restrictive. The details of how to relax the restrictions are outlined below.


For SymbC/TensC, let's illustrate using SymbC. (The idea for TensC is similar)

The current __init__ interface for SymbC is quite complicated and covers multiple ways of constructing a SymbC, leading to excessive Optional[...] = None annotations. The signature is cluttered, and it is not clear to users how it should be used. https://github.com/april-tools/cirkit/blob/1f1af2d36f274d40007af1adf325581dd5670447/cirkit/new/symbolic/symbolic_circuit.py#L27-L40

An alternative is to introduce from_* classmethods, each of which provides a way to construct the circuit from some high-level settings. (PS: these should not be staticmethods unless subclasses should have exactly the same behaviour.) Meanwhile, __init__ will only handle construction from a low-level representation, and is used by the from_* methods.

In this way, users can easily subclass SymbC, and from_*, __init__, and "other methods" can each be modified independently as needed. It's also easy for us to add new from_* methods if we find them useful, without complicating __init__.
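
A minimal sketch of the proposed pattern. All names here (`SymbolicCircuitSketch`, `from_chain`, `describe`) are hypothetical and chosen for illustration; they are not cirkit's actual API. The point is only that `__init__` takes a low-level representation, while a `from_*` classmethod builds that representation from high-level settings and calls `cls(...)`.

```python
from typing import Iterable, List, Tuple, Type, TypeVar

T = TypeVar("T", bound="SymbolicCircuitSketch")


class SymbolicCircuitSketch:
    """Hypothetical stand-in for SymbC."""

    def __init__(self, layers: Iterable[str], edges: Iterable[Tuple[str, str]]) -> None:
        # __init__ only sees a low-level representation: layer names and directed edges.
        self.layers: List[str] = list(layers)
        self.edges: List[Tuple[str, str]] = list(edges)

    @classmethod
    def from_chain(cls: Type[T], num_layers: int) -> T:
        # Hypothetical high-level constructor: a linear chain of layers.
        layers = [f"layer{i}" for i in range(num_layers)]
        edges = list(zip(layers, layers[1:]))
        # Calling cls (not the base class by name) lets subclasses get their own type back.
        return cls(layers, edges)


class DescribedCircuit(SymbolicCircuitSketch):
    """A user subclass that adds one method and inherits construction unchanged."""

    def describe(self) -> str:
        return f"{type(self).__name__}: {len(self.layers)} layers, {len(self.edges)} edges"


circuit = DescribedCircuit.from_chain(3)
```

Because `from_chain` is a classmethod, `DescribedCircuit.from_chain(3)` returns a `DescribedCircuit`; a staticmethod that hard-coded the base class would not.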


RG is slightly different from the circuit classes, in that it does not have a single method that does the whole construction process, but has several methods (__init__, freeze and several others) that together handle the construction. This is because different algorithms construct the RG in quite different ways, and nothing can be provided to __init__ at the beginning. The @final is also used to hint that RGs constructed by different algorithms (PD, QT, ...) are all RGs, and that we can also construct RGs without the algorithms. From the point of view of JSON loading, it does not matter which algorithm the RG comes from; only the structure itself matters. Thus, we have only one RG class instead of one class per algorithm (e.g. class PD, class QT, etc).

Similar to the above for circuits, for RG we can also introduce from_algorithm, which also makes it possible to merge all the construction into __init__ (no need for freeze/is_frozen), which is then guaranteed to see a low-level input. (A counterargument: a partially constructed RG is still a "maintained" container for RGNodes, which is easier for algorithms to use than raw Python containers.)

Meanwhile, the algorithms, which are currently functions that produce an RG, can become implementations of an RGAlg abstract class, which can be passed to from_algorithm.
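
A rough sketch of how this could look. `RGAlg` and `from_algorithm` are the names proposed above; everything else (`RegionGraphSketch`, the `build` method, `BinarySplit`, and the partition-based low-level format) is an assumption made for illustration and is not cirkit's actual design. `BinarySplit` is a toy algorithm, not PD or QT.

```python
from abc import ABC, abstractmethod
from typing import FrozenSet, Iterable, List, Tuple, Type, TypeVar

Scope = FrozenSet[int]
# Assumed low-level format: (region scope, the sub-region scopes it is split into).
Partition = Tuple[Scope, Tuple[Scope, ...]]

R = TypeVar("R", bound="RegionGraphSketch")


class RGAlg(ABC):
    """Abstract construction algorithm; `build` is a hypothetical method name."""

    @abstractmethod
    def build(self) -> List[Partition]:
        ...


class BinarySplit(RGAlg):
    """Toy algorithm: recursively halve the variable range."""

    def __init__(self, num_vars: int) -> None:
        self.num_vars = num_vars

    def build(self) -> List[Partition]:
        partitions: List[Partition] = []
        stack = [tuple(range(self.num_vars))]
        while stack:
            scope = stack.pop()
            if len(scope) < 2:
                continue  # single-variable regions are leaves
            mid = len(scope) // 2
            left, right = scope[:mid], scope[mid:]
            partitions.append((frozenset(scope), (frozenset(left), frozenset(right))))
            stack.extend([left, right])
        return partitions


class RegionGraphSketch:
    """Hypothetical stand-in for RG, fully constructed in __init__ (no freeze step)."""

    def __init__(self, partitions: Iterable[Partition]) -> None:
        self.partitions = tuple(partitions)
        # Collect every region mentioned, either as a parent or as a part.
        self.regions = frozenset(
            scope for region, parts in self.partitions for scope in (region, *parts)
        )

    @classmethod
    def from_algorithm(cls: Type[R], alg: RGAlg) -> R:
        # The algorithm yields the low-level input; __init__ does the rest.
        return cls(alg.build())


rg = RegionGraphSketch.from_algorithm(BinarySplit(4))
```

In this sketch the algorithm works on plain Python containers, which is exactly the trade-off raised in the counterargument above: the RG never exists in a partially constructed state, but the algorithm loses the "maintained" container while building.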