Revert ecosystem dependencies

qiboteam / qibo

A framework for quantum computing

https://qibo.science

Apache License 2.0

288 stars 60 forks source link

Revert ecosystem dependencies #1117

Open alecandido opened 10 months ago

alecandido commented 10 months ago

Currently, the dependencies in the Qibo ecosystem are the following:

flowchart LR
  qibocal --> qibolab --> qibo
  qibojit --> qibo

But then the users will need a separate installation of all the various tools, even just to run circuits. Moreover, we usually leave to the user the burden of installing dependencies for the backends (including TensorFlow and NumPy). I will open a separate issue for that.

A better organization could be the following:

flowchart LR
  qibocal --> qibolab
  qibo --> qibolab --> qibo-core
  qibo --> qibojit --> qibo-core
  qibo --> qibo-core

Indeed, the user will often interact just with qibo (as frontend), or qibocal, for different tasks. Everything else is just going to implement backends.

In particular, qibo-core will only contain the current qibo.backends and qibo.gates, since they are the building blocks used by the backends.

Other relevant points:

we can have a new repo for the core, but we don't have to (we could even keep here, it's just a matter of reshuffling workflows), but
we will need separate PyPI packages, for the proper dependency management
we do not need a breaking release (i.e. Qibo 0.3), since we could keep exposing everything through as before (just exporting from qibo.core).

hay-k commented 8 months ago

Assuming that the main dependencies are backends and gates, I would propose another resolution:

flowchart LR
  qibo --> qibo.core
  qibojit --> qibo.core
  qibocal --> qibolab --> qibo.core

The idea is roughly the following

The current dependency structure is already pretty good, no need to invert it. We just need to refine it. In fact, the current structure is already in the spirit of the dependency inversion principle: the abstract interface (backends) is defined by the higher level (qibo), and is implemented by the lower levels (e.g. qibolab, qibojit).
The drawbacks of the current dependency structure are:
- To import the interface, qibolab, qibojit etc. have to install entire qibo as dependency. This is addressed by creating the separately installable qibo.core module as you already suggested.
- qibolab and qibocal import a whole bunch of things from qibo, however most of it seem to be primitive things that do not need to come from qibo, e.g. logger, raise_error, etc. Hopefully, the dependency can be stripped down to backends and gates only (I didn't check all imports one by one, so I am not sure if there is anything else important). This shall be solved by cleaning up the code. In fact this can be done separately as a first step.
User directly decides what to install and what not. E.g. they decide to install qibojit along with qibo. In this case concrete backend instances can be supplied to qibo in a dependency injection manner.
qibo itself can have its own backend implementations.

alecandido commented 8 months ago

The current dependency structure is already pretty good, no need to invert it.

Well, it's partially inverted even in your diagram, since no one would depend on Qibo any longer...

Btw, sorry for the misleading part, but qibo.core is not intended to be part of Qibo, I will update the OP to qibo-core.

alecandido commented 8 months ago

Hopefully, the dependency can be stripped down to backends and gates only (I didn't check all imports one by one, so I am not sure if there is anything else important).

Yes, that's the main goal. Better stated this way.

User directly decides what to install and what not. E.g. they decide to install qibojit along with qibo. In this case concrete backend instances can be supplied to qibo in a dependency injection manner.

I'd like it as well, but I refrained from proposing it. This is more "developer mode", in which you know what you need. Instead, I wanted qibo to be a hat on top of everything. It could even be a metapackage, eventually, just depending on the other packages, but containing no code (e.g. qiboinfo will contain all th Qinfo stuffs, qiboml all the variational, and qibo-models all the pre-built objects). However, no need to go this far in one shot.

So, to me, it might be a good idea that pip install qibo comes batteries included, unless there are dependencies that are not available on all platforms (or heavy ones, like TensorFlow), in which case they could under suitable extras.

However, the two diagrams are almost the same, yours has two arrows less, so I believe we are almost on the same page.

alecandido commented 8 months ago

qibo itself can have its own backend implementations.

About this, I just opened a relevant issue, essentially lifting a discussion that was already mentioned here (or better, that was mentioning this issue, and thus displayed by GitHub).

https://github.com/qiboteam/qibo/issues/1222

So, I would implement:

the AbstractBackend necessarily in qibo-core
the NumpyBackend possibly in qibo-core as well
the TensorflowBackend (and whatever other ML based one) in qiboml, if ever we will split it @MatteoRobbiati
the CliffordBackend in the Qinfo module, if ever we will split it @BrunoLiegiBastonLiegi @renatomello

What do you think @stavros11? (and @hay-k of course, but you were already involved in the discussion)

hay-k commented 8 months ago

Well, it's partially inverted even in your diagram

It is not inverted. If thing A depends on thing B in the current structure, this relation remains the same in my diagram. E.g. qibolab depends on backends currently, and it does so in the proposed diagram. It is pretty much the same in your proposal, except that you introduce additional dependency from qibo to others, that is an inversion compared to current state. My argument is basically that this additional complexity is not needed.

Of course all this depends on the definition of what qibo is. If it is supposed to become a no-code metapackage as you describe, then ok, it can depend on everything. However, this does not automatically solve the problem under discussion here. The circuit execution functionality goes somewhere (let's call it qiboc), and this somewhere should still respect the standard structure I propose. e.g. it should be the following (if you remove qibo and rename qiboc to qibo it becomes the diagram I proposed)

flowchart LR
  qibo --> qiboc --> qibo.core
  qibo --> qibojit --> qibo.core
  qibocal --> qibolab --> qibo.core
  qibo --> qibolab

and not the following (if you remove qibo and rename qiboc to qibo it becomes the diagram you proposed)

flowchart LR
  qibocal --> qibolab --> qibo.core
  qibo --> qiboc --> qibo.core
qiboc --> qibojit
  qibo --> qibojit --> qibo.core
  qibo --> qibolab
  qiboc --> qibolab

This is more "developer mode", in which you know what you need.

Well, it shall not be any different from current ways of using qibo - users know that there are backends, users knows how to choose a specific backend. User knows that for specific backend like qibojit they have to install an additional package, etc. In general, I think it is a fair expectation that users know what they need.

What do you think @stavros11? (and @hay-k of course, but you were already involved in the discussion)

I am not that familiar with all the listed backends, so not much to comment, however I fully support the idea of avoiding a design where everybody inherits from NumpyBackend. I have the impression that this common inheritance pattern exists because AbstractBackend defines too much functionality, some of which can be implemented elsewhere and not as part of backends. So I would suggest to look into this from this angle as well.

alecandido commented 8 months ago

It is not inverted. If thing A depends on thing B in the current structure, this relation remains the same in my diagram. E.g. qibolab depends on backends currently, and it does so in the proposed diagram.

It's a perspective issue. This is already an inversion, and it's the core of the proposal: the backends are in qibo, so everything depends on qibo, but qibo itself is the/a user-facing package, so it can't be the foundation for everyone else (if you think qibo-core as the new qibo, then there is no inversion, but if the qibo identity is retained by what we call qibo, then there will be). But I don't care much about the naming, if you agree with the concept, that's enough.

It is pretty much the same in your proposal, except that you introduce additional dependency from qibo to others, that is an inversion compared to current state.

That is the second half, yes. However, I'm much more keen in the first one. Though, I explained the motivations for this.

Of course all this depends on the definition of what qibo is.

Agreed

The circuit execution functionality goes somewhere (let's call it qiboc)

I'm not sure what you mean with circuit execution functionality: to me, the circuit executors are the backends...

I still believe we agree about the main point. The main discrepancy is caused by the presence of qiboc, since I'm not sure what is supposed to do (and if you remove it, the two diagrams in your last comment are just the same).

alecandido commented 8 months ago

I am not that familiar with all the listed backends, so not much to comment, however I fully support the idea of avoiding a design where everybody inherits from NumpyBackend. I have the impression that this common inheritance pattern exists because AbstractBackend defines too much functionality, some of which can be implemented elsewhere and not as part of backends. So I would suggest to look into this from this angle as well.

I partially agree with this, but on the other side, I'm not too much against the broad scope of backend.

What I have in mind is that, if we trimmed down the functionalities of the backend, but part of those functions could have a different implementation at least for one backend, then we would end up implementing a backend mechanism once more somewhere else.

So, in my mind, the backend should expose every function that could have multiple implementations (and it's used for Qibo operations). Where multiple does not mean exactly the number of backends available, but just > 1.

However, since most of these functions are frequently not performance-critical, and often good enough in their NumPy flavor, I'd just provide a default implementation of them. (I.e. I'm open to implement in the AbstractBackend everything but execute_circuit, in principle - but there could be something else that is also worth to keep abstract)

But let's discuss this in #1222.

stavros11 commented 8 months ago

flowchart LR
  qibo --> qibo.core
  qibojit --> qibo.core
  qibocal --> qibolab --> qibo.core

I like this proposal more, with some minor remarks:

If Circuit is part of qibo and not qibo-core (which should be the case), then qibocal should also depend on qibo because Circuits are used in various routines (RB).
The biggest part of qibolab (pulses, platforms, qubits, channels, drivers), which is also the part most relevant for labs, is independent of qibo, even qibo-core, and it could stand alone as a seperate project. Unless I am missing something, the only parts that depend on qibo-core is the compiler (depends on qibo.gates) and the backend (inherits AbstractBackend), basically the things more relevant to "theorists".

That being said, similar arguments to the second point could be made for other cases, eg. qibojit custom operators are also independent of qibo-core, most qibocal routines are independent of circuits, etc., but if we go to this direction we will end up with a million repositories. It is just that for qibolab in particular, I can think of some use cases in the near term, for example labs that are not very interested in running circuits.

qibolab and qibocal import a whole bunch of things from qibo, however most of it seem to be primitive things that do not need to come from qibo, e.g. logger, raise_error, etc. Hopefully, the dependency can be stripped down to backends and gates only (I didn't check all imports one by one, so I am not sure if there is anything else important). This shall be solved by cleaning up the code. In fact this can be done separately as a first step.

Fully agree about the logger and errors. It is also easy to clean this up. qiboteam/qibolab#806

User directly decides what to install and what not. E.g. they decide to install qibojit along with qibo. In this case concrete backend instances can be supplied to qibo in a dependency injection manner.

This is similar to the current behavior and I agree with keeping it. Basic users will still get the numpy backend when installing qibo via qibo-core. People interested in more advanced stuff should know how to install an additional dependency in most cases.

the AbstractBackend necessarily in qibo-core

the NumpyBackend possibly in qibo-core as well

Yes, these should both be in qibo-core. I also agree with merging them and the proposal in #1222. I also think it is worth considering reducing the backend methods, but I would rather keep all methods that need to be re-implemented by at least two backends there.

I believe qibo.gates also need to be in qibo-core as they are used by both qibolab (compiler) and qibojit. These should be the only things needed in core.

the TensorflowBackend (and whatever other ML based one) in qiboml, if ever we will split it @MatteoRobbiati

the CliffordBackend in the Qinfo module, if ever we will split it @BrunoLiegiBastonLiegi @renatomello

I understand these will stay in qibo for the time-being, which I believe is fine.

alecandido commented 8 months ago

That being said, similar arguments to the second point could be made for other cases, eg. qibojit custom operators are also independent of qibo-core, most qibocal routines are independent of circuits, etc., but if we go to this direction we will end up with a million repositories. It is just that for qibolab in particular, I can think of some use cases in the near term, for example labs that are not very interested in running circuits.

I'd say that we should make qibo-core small enough that should be affordable for everyone to depend on it. I would not put too much effort in stripping that.

I believe qibo.gates also need to be in qibo-core as they are used by both qibolab (compiler) and qibojit. These should be the only things needed in core.

Yes, backends and gates are the main mechanics of Qibo. But I would have said also circuits.

I understand these will stay in qibo for the time-being, which I believe is fine.

That's the case (from my perspective). In any case, I see it as a separate issue.

hay-k commented 7 months ago

I'm not sure what you mean with circuit execution functionality: to me, the circuit executors are the backends...

Yeah, I have to admit that formulation was confusing. I meant functionality that is on top of backends, the parts of the code that call the circuit execution method(s) of backends, and analyze the returned results.

I'd say that we should make qibo-core small enough that should be affordable for everyone to depend on it. I would not put too much effort in stripping that.

This was my line of thinking as well. 99% of qibolab can be independent of qibo concepts, but at the end of the day (or maybe the decade :D) all that 99% exists to support the other 1%, which is the abstract backend implementation for qibo.

alecandido commented 7 months ago

Yeah, I have to admit that formulation was confusing. I meant functionality that is on top of backends, the parts of the code that call the circuit execution method(s) of backends, and analyze the returned results.

Oh, that is fine to have in the "frontend" package. I just wanted the bare Circuit to be in the core, but everything else is definitely on top (maybe something minimal about the output format should be kept, since it has to be standardized across different packages - but we could discuss in more details when we'll get there).

This was my line of thinking as well. 99% of qibolab can be independent of qibo concepts, but at the end of the day (or maybe the decade :D) all that 99% exists to support the other 1%, which is the abstract backend implementation for qibo.

Exactly!