Closed MultiTrickFox closed 1 year ago
Hi @developersfox, thank you so much for your question! :confetti_ball:
Allowing this addition would make a great feature indeed! :slightly_smiling_face: It would help with using different parameters and with moving towards using QNodeCollections more optimally.
Let us know if you'd be interested in being a contributor for creating the addition, we'd be happy to start a discussion about how this could be implemented!
Thanks @developersfox! As @antalszava noted, this is a feature we would love to add.
PS: Would this system work with PyTorch GPU tensors, i.e., could PennyLane work on the GPU directly rather than copying them from the GPU?
Unfortunately, performing quantum simulations on the GPU requires support from the quantum device. Currently, the master branch has a `default.qubit.tf` quantum device that supports executing TensorFlow QNodes on GPUs. We don't yet have a device that supports Torch QNodes on GPUs, due to Torch's lack of support for complex numbers.
@antalszava, @josh146 Hello, thank you for your answer. I would definitely like to give the implementation a go, although I must say I'm coming from the Qiskit and PyTorch side, so I'm not as familiar with PennyLane internals. But that doesn't mean I'm shy about reading source :) Please show me where to start and the general roadmap I should take.
In the meantime, is there any way I can parallelize this batch circuit process? My current architecture has a major bottleneck at this circuit part. I have tried a thread pool, where, as you know, the context modification error occurs. I cannot use a multi-core pool, as the gradient information is lost. Is there any present fix to at least speed up this bottleneck a little?
Update: After reading your source code for QNodeCollections, I see that what it does is feed the same params group into different QNodes (and dask them if parallel).
What I'm proposing for the batch prop is as follows:

1. extract a parametric matrix representation of the QNode (the circuit as a single matrix with parametric values);
2. for each params_group in the batch, feed the params tensor into the parametric matrix;
3. stack all these filled parametric matrices to get the "batch circuits stacked together";
4. stack the starting quantum states as well (assume, e.g., all are 00), and compute starting_states * batch_circuits => final_states, from which you can calculate probabilities etc.
I have written this based on the assumption that a matrix representation of some sort can be obtained in your library, in parametric form. Similar to Operator(circuit) in Qiskit, but a parametric version of this.
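The four steps above can be sketched in plain NumPy. This is only an illustration, not PennyLane API: it assumes (hypothetically) a one-qubit circuit whose "parametric matrix" is a single RY rotation, and all names (`ry`, `thetas`, `batch_circuits`) are made up for the example:

```python
import numpy as np

def ry(theta):
    # Step 1 stand-in: the circuit as a single parametric matrix.
    # Here it is just an RY rotation on one qubit.
    c, s = np.cos(theta / 2), np.sin(theta / 2)
    return np.array([[c, -s], [s, c]])

thetas = np.array([0.0, np.pi / 2, np.pi])           # one params row per batch entry
batch_circuits = np.stack([ry(t) for t in thetas])   # steps 2-3: fill and stack
start_states = np.tile(np.array([1.0, 0.0]), (len(thetas), 1))  # step 4: all |0>

# Batched matrix-vector product: final_states[b] = batch_circuits[b] @ start_states[b]
final_states = np.einsum("bij,bj->bi", batch_circuits, start_states)
probs = np.abs(final_states) ** 2
```

For theta = 0 the probabilities stay at [1, 0]; for theta = pi they flip to [0, 1], as expected for an RY rotation of |0>.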
> In the meantime, is there any way I can parallelize this batch circuit process? My current architecture has a major bottleneck at this circuit part
I haven't tried this myself, but something that could work is simply creating a new class/subclass of QNodeCollection that inverts the current assumption: rather than having multiple QNodes with the same arguments, this collection would represent a single QNode with a list of different input arguments.

Then, it is simply a matter of adjusting the existing parallel code. Perhaps something like the following?

```python
results = []
for a in args:
    results.append(dask.delayed(self.qnode)(*a, **kwargs))
return dask.compute(*results, scheduler=_scheduler)
```

Our end goal would be to generalize this even further (multiple QNodes + multiple arguments), but this approach seems sufficient for your needs above 🙂
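A self-contained sketch of the idea, outside PennyLane: `BatchedCall` is a hypothetical name (not PennyLane API) for a wrapper that maps one callable over many argument sets, using dask's `delayed`/`compute` when available and falling back to a plain loop otherwise:

```python
class BatchedCall:
    """Hypothetical sketch: one callable, many argument sets -- the
    inverse of QNodeCollection's many-callables-one-argument model."""

    def __init__(self, fn):
        self.fn = fn

    def __call__(self, arg_sets, parallel=False, **kwargs):
        if parallel:
            try:
                import dask
            except ImportError:
                pass  # dask not installed; use the sequential path below
            else:
                delayed = [dask.delayed(self.fn)(*a, **kwargs) for a in arg_sets]
                return list(dask.compute(*delayed, scheduler="threads"))
        return [self.fn(*a, **kwargs) for a in arg_sets]
```

Usage would look like `BatchedCall(my_qnode)([(row,) for row in param_rows], parallel=True)`, with each tuple in `arg_sets` being one QNode call's positional arguments.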
> I have written this based on the assumption that matrix representation of some sort could be obtained in your lib, and be in parametric form.
PennyLane might differ from Qiskit in this regard, in that the frontend components always treat the QNode as a "black box": there is no method to convert a QNode to a parametric matrix representation; instead, this is left up to the device (simulator or hardware) and the interface (Autograd/Torch/TensorFlow).
Instead, you can always treat the QNode as an end-to-end differentiable parametrized function, and use this to compute the final batched states.
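In the black-box view, batching happens at the call level: loop over the rows of the parameter matrix and stack the results. A minimal sketch, where `qnode_stub` is a hypothetical stand-in for a real QNode (any differentiable function from a params vector to an expectation value):

```python
import numpy as np

def qnode_stub(params):
    # Hypothetical stand-in for a QNode: maps a params vector to a
    # scalar "expectation value"; internals stay a black box.
    return np.cos(params).prod()

batch = np.array([[0.0, 0.0], [np.pi, 0.0]])        # shape [batch_size, n_params]
out = np.stack([qnode_stub(row) for row in batch])  # one output per batch row
```

In an autograd framework the same pattern works with `torch.stack` or `tf.stack`, which keeps the stacked outputs differentiable with respect to the parameter matrix.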
@josh146, @MultiTrickFox - Can I start on this one? I got your point, @josh146, to create a new class/subclass and keep the code as generic as possible. I am from a Machine Learning background and recently started contributing to Qiskit. I am really interested in working on the PennyLane library.
Hi @sagarpahwa!
It's great that you'd like to contribute to PennyLane. We welcome contributions :smile:
This particular issue is still open because we're in the process of upgrading the core of the library, and we wanted to make sure that was well-established before we started building on top of it (as with this issue). The risk of tackling this issue is that any fix might be short-lived. As such, I'd recommend not taking this particular issue on at the moment, until the new functionality is a bit more established in the core.
Thank you @co9olguy for the update.
@sagarpahwa yes, please take on the issue. I ended up implementing the system I described above. I wish I could have done it in PL, but my simulator is independent of the framework. If PL gets a matrix-level simulator sometime in the future, I would hope to contribute to it.
Thanks @MultiTrickFox for the update. I will start working on this after 7 am IST on 11 October; I am working on an assignment submission for now. Let me know if that's fine.
@sagarpahwa Yeah, totally fine. Good luck on your assignment too!
Hi @MultiTrickFox, please excuse my delayed response; I was trying to arrange some time to work on this one. For now, I have some other exam-related priorities and won't be able to devote time anytime soon.
I'm going to close this issue, because we now have parameter batching/broadcasting in PennyLane! It is being implemented on a case-by-case basis, so if you find that you want to use an operator that doesn't support broadcasting yet, let us know and we can start working on support. Or if you're feeling brave, you can put up a PR and tag a team member for review/help 😄
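As a rough analogy for what parameter broadcasting buys (this is NumPy, not PennyLane code): a function written for a single parameter also accepts a batch of parameters along a leading axis and returns a batch of results in one call, with no Python loop over rows:

```python
import numpy as np

def expval(theta):
    # Written for a scalar angle, but numpy ufuncs broadcast, so a
    # batch of angles yields a batch of "expectation values" in one
    # call -- the same convenience broadcasting brings to a QNode.
    return np.cos(theta)

single = expval(0.0)                      # scalar in, scalar out
batched = expval(np.array([0.0, np.pi]))  # batch in, batch out
```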
Meanwhile, I graduated from my MSc program..
Congrats on your MSc @MultiTrickFox!
Issue Description: Propagating copies of the same circuit, with different parametric values, for batch propagation in neural nets.
Requested behavior:

```python
my_qnode = qnode(circuit_fn(params))
nn_out = matrix[batch_size, circuit_params]
QNodeCollection(my_qnode, nn_out)
# etc..
```
Further description: While doing batch propagation for the layer, the output is received as [batch_size, params], so each individual "params" input for the same circuit must be sliced from the list (which takes "a lot" of Python time). Instead, what if a QNodeCollection could be defined for the same circuit, with the input tensor directly? Would this be possible?
PS: Would this system work with PyTorch GPU tensors, i.e., could PennyLane work on the GPU directly rather than copying them from the GPU?
Thank you..