Closed MultiTrickFox closed 1 year ago
Hi @developersfox, thank you so much for your question! :confetti_ball:
Allowing this addition would make a great feature indeed! :slightly_smiling_face: It would help with using different parameters and with moving towards using QNodeCollections more optimally.
Let us know if you'd be interested in being a contributor for creating the addition, we'd be happy to start a discussion about how this could be implemented!
Thanks @developersfox! As @antalszava noted, this is a feature we would love to add.
PS: Would this system work with PyTorch GPU tensors, i.e., could PennyLane work on the GPU directly rather than copying them from the GPU?
Unfortunately, performing quantum simulations on the GPU requires support from the quantum device. Currently, the master branch has a `default.qubit.tf` quantum device that supports executing TensorFlow QNodes on GPUs. We don't yet have a device that supports Torch QNodes on GPUs, due to Torch's lack of support for complex numbers.
@antalszava, @josh146 Hello, thank you for your answer. I would definitely like to give the implementation a go, although I must say I'm coming from the Qiskit and PyTorch side, so I'm not as familiar with PennyLane internals. But that doesn't mean I'm shy about reading source :) Please show me where to start and the general roadmap I should take.
In the meantime, is there any way I can parallelize this batch circuit process? My current architecture has a major bottleneck at this circuit part. I have tried a thread pool, where, as you know, the context modification error occurs. I cannot use a multi-core pool, as the gradient information is lost. Is there any present fix to at least speed up this bottleneck a little?
Update: After reading your source code for QNodeCollections, I see that what it does is feed the same params group into different QNodes (and dask them if parallel).
What I'm proposing for the batch prop is as follows:

1. extract a parametric matrix representation of the QNode (the circuit as a single matrix with parametric values);
2. for each params_group in the batch, feed the params tensor into the parametric matrix;
3. stack all these filled parametric matrices to get the "batch circuits stacked together";
4. stack the starting quantum states as well (assume, e.g., all are 00), and compute starting_states * batch_circuits => final_states, from which you can calculate probabilities etc.
I have written this based on the assumption that a matrix representation of some sort can be obtained in your library, in parametric form. Similar to Operator(circuit) in Qiskit, but a parametric version of this.
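The four steps above can be sketched in plain NumPy. This is only an illustration, not PennyLane API: it assumes (hypothetically) a one-qubit circuit whose "parametric matrix" is a single RY rotation, and all names (`ry`, `thetas`, `batch_circuits`) are made up for the example:

```python
import numpy as np

def ry(theta):
    # Step 1 stand-in: the circuit as a single parametric matrix.
    # Here it is just an RY rotation on one qubit.
    c, s = np.cos(theta / 2), np.sin(theta / 2)
    return np.array([[c, -s], [s, c]])

thetas = np.array([0.0, np.pi / 2, np.pi])           # one params row per batch entry
batch_circuits = np.stack([ry(t) for t in thetas])   # steps 2-3: fill and stack
start_states = np.tile(np.array([1.0, 0.0]), (len(thetas), 1))  # step 4: all |0>

# Batched matrix-vector product: final_states[b] = batch_circuits[b] @ start_states[b]
final_states = np.einsum("bij,bj->bi", batch_circuits, start_states)
probs = np.abs(final_states) ** 2
```

For theta = 0 the probabilities stay at [1, 0]; for theta = pi they flip to [0, 1], as expected for an RY rotation of |0>.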
> In the meantime, is there any way I can parallelize this batch circuit process? My current architecture has a major bottleneck at this circuit part
I haven't tried this myself, but something that could work is simply creating a new class/subclass of QNodeCollection that inverts the current assumption: rather than having multiple QNodes with the same arguments, this collection would represent a single QNode with a list of different input arguments.

Then, it is simply a matter of adjusting the existing parallel code. Perhaps something like the following?

```python
results = []
for a in args:
    results.append(dask.delayed(self.qnode)(*a, **kwargs))
return dask.compute(*results, scheduler=_scheduler)
```

Our end goal would be to generalize this even further (multiple QNodes + multiple arguments), but this approach seems sufficient for your needs above 🙂
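A self-contained sketch of the idea, outside PennyLane: `BatchedCall` is a hypothetical name (not PennyLane API) for a wrapper that maps one callable over many argument sets, using dask's `delayed`/`compute` when available and falling back to a plain loop otherwise:

```python
class BatchedCall:
    """Hypothetical sketch: one callable, many argument sets -- the
    inverse of QNodeCollection's many-callables-one-argument model."""

    def __init__(self, fn):
        self.fn = fn

    def __call__(self, arg_sets, parallel=False, **kwargs):
        if parallel:
            try:
                import dask
            except ImportError:
                pass  # dask not installed; use the sequential path below
            else:
                delayed = [dask.delayed(self.fn)(*a, **kwargs) for a in arg_sets]
                return list(dask.compute(*delayed, scheduler="threads"))
        return [self.fn(*a, **kwargs) for a in arg_sets]
```

Usage would look like `BatchedCall(my_qnode)([(row,) for row in param_rows], parallel=True)`, with each tuple in `arg_sets` being one QNode call's positional arguments.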
> I have written this based on the assumption that matrix representation of some sort could be obtained in your lib, and be in parametric form.
PennyLane might differ from Qiskit in this regard, in that the frontend components always treat the QNode as a "black box": there is no method to convert a QNode to a parametric matrix representation; instead, this is left up to the device (simulator or hardware) and the interface (Autograd/Torch/TensorFlow).
Instead, you can always treat the QNode as an end-to-end differentiable parametrized function, and use this to compute the final batched states.
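In the black-box view, batching happens at the call level: loop over the rows of the parameter matrix and stack the results. A minimal sketch, where `qnode_stub` is a hypothetical stand-in for a real QNode (any differentiable function from a params vector to an expectation value):

```python
import numpy as np

def qnode_stub(params):
    # Hypothetical stand-in for a QNode: maps a params vector to a
    # scalar "expectation value"; internals stay a black box.
    return np.cos(params).prod()

batch = np.array([[0.0, 0.0], [np.pi, 0.0]])        # shape [batch_size, n_params]
out = np.stack([qnode_stub(row) for row in batch])  # one output per batch row
```

In an autograd framework the same pattern works with `torch.stack` or `tf.stack`, which keeps the stacked outputs differentiable with respect to the parameter matrix.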
@josh146, @MultiTrickFox - Can I start on this one? I got your point, @josh146, to create a new class/subclass and keep the code as generic as possible. I am from a Machine Learning background and recently started contributing to Qiskit. I am really interested in working on the PennyLane library.
Hi @sagarpahwa!
It's great that you'd like to contribute to PennyLane. We welcome contributions :smile:
This particular issue is still open because we're in the process of upgrading the core of the library, and we wanted to make sure that was well-established before we started building on top of it (as with this issue). The risk of tackling this issue is that any fix might be short-lived. As such, I'd recommend not taking this particular issue on at the moment, until the new functionality is a bit more established in the core.
Thank you @co9olguy for the update.
@sagarpahwa yes, please take on the issue. I ended up implementing the system I described above. I wish I could have done it in PL, but my simulator is independent of the framework. If PL gets a matrix-level simulator sometime in the future, I would hope to contribute to it.
Thanks @MultiTrickFox for the update. I will start working on this after 7 am IST on 11 October; I am working on an assignment submission for now. Let me know if that's fine.
@sagarpahwa Yeah, totally fine. Good luck on your assignment too!
Hi @MultiTrickFox, please excuse my delayed response; I was trying to arrange some time to work on this one. For now, I have some other exam-related priorities and won't be able to devote time anytime soon.
I'm going to close this issue, because we now have parameter batching/broadcasting in PennyLane! It is being implemented on a case-by-case basis, so if you find that you want to use an operator that doesn't support broadcasting yet, let us know and we can start working on support. Or if you're feeling brave, you can put up a PR and tag a team member for review/help 😄
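As a rough analogy for what parameter broadcasting buys (this is NumPy, not PennyLane code): a function written for a single parameter also accepts a batch of parameters along a leading axis and returns a batch of results in one call, with no Python loop over rows:

```python
import numpy as np

def expval(theta):
    # Written for a scalar angle, but numpy ufuncs broadcast, so a
    # batch of angles yields a batch of "expectation values" in one
    # call -- the same convenience broadcasting brings to a QNode.
    return np.cos(theta)

single = expval(0.0)                      # scalar in, scalar out
batched = expval(np.array([0.0, np.pi]))  # batch in, batch out
```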
Meanwhile, I graduated from my MSc program..
Congrats on your MSc @MultiTrickFox!
Issue Description: Propagating copies of the same circuit, with different parametric values, for batch propagation in neural nets.
Requested behavior:

```python
my_qnode = qnode(circuit_fn(params))
nn_out = matrix[batch_size, circuit_params]
QNodeCollection(my_qnode, nn_out)
# etc..
```
Further description: While doing batch propagation for the layer, the output is received as [batch_size, params], so each individual "params" input for the same circuit must be sliced from the list (which takes "a lot" of Python time). Instead, what if a QNodeCollection could be defined for the same circuit, with the input tensor directly? Would this be possible?
PS: Would this system work with PyTorch GPU tensors, i.e., could PennyLane work on the GPU directly rather than copying them from the GPU?
Thank you..