PennyLaneAI / pennylane

PennyLane is a cross-platform Python library for quantum computing, quantum machine learning, and quantum chemistry. Train a quantum computer the same way as a neural network.
https://pennylane.ai
Apache License 2.0

[Quantum-Cloud] Helm Chart for easy deployment of API-Server to "Deploy and Serve" Hybrid Models #1488

Open arshpreetsingh opened 3 years ago

arshpreetsingh commented 3 years ago

Feature details

Benefits of deploying and serving a hybrid model via a REST API:

Query the model over the REST API:

curl -d '{"input-data": ["image-1.png", "image-2.png", "image-3.png", "image-4.png"]}' \
    -X POST http://localhost:8080/v1/models/hybrid-model-torch:predict

OUTPUT: ["bee","ant","bee","deer"]

Other projects already do this:

TensorFlow Serving: https://www.tensorflow.org/tfx/serving/docker
PyTorch (TorchServe): https://github.com/pytorch/serve/

Why use Helm: https://helm.sh/

Implementation

Use Flask-RESTful to handle communication between the client and the hybrid model running in the cloud (see the sketch after this list). https://flask-restful.readthedocs.io/en/latest/

Create a Kubernetes volume on the node for the models that need to be served. https://kubernetes.io/docs/concepts/storage/volumes/

Flask+Kubernetes: https://kubernetes.io/blog/2019/07/23/get-started-with-kubernetes-using-python/

Finally, use Helm for one-command deployment and maintenance.
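A minimal sketch of what such a serving endpoint could look like, assuming a pre-trained PennyLane circuit whose weights are read from the mounted Kubernetes volume (the route, paths, and circuit here are placeholders for illustration, not an existing PennyLane API):

```python
# serve.py -- hypothetical Flask-RESTful serving sketch (not an existing PennyLane API)
import numpy as np
import pennylane as qml
from flask import Flask, request
from flask_restful import Api, Resource

app = Flask(__name__)
api = Api(app)

dev = qml.device("default.qubit", wires=2)

@qml.qnode(dev)
def circuit(features, weights):
    # Placeholder circuit: encode the input features, then apply trained rotations.
    qml.RX(features[0], wires=0)
    qml.RX(features[1], wires=1)
    qml.RY(weights[0], wires=0)
    qml.RY(weights[1], wires=1)
    qml.CNOT(wires=[0, 1])
    return qml.expval(qml.PauliZ(0))

# Trained weights would live on the Kubernetes volume mounted at /models.
weights = np.load("/models/hybrid-model/weights.npy")

class Predict(Resource):
    def post(self):
        payload = request.get_json(force=True)
        # "input-data" is assumed here to contain pre-extracted feature vectors,
        # not raw image files as in the curl example above.
        outputs = [float(circuit(np.asarray(x), weights)) for x in payload["input-data"]]
        return {"predictions": outputs}

api.add_resource(Predict, "/v1/models/hybrid-model:predict")

if __name__ == "__main__":
    app.run(host="0.0.0.0", port=8080)
```

With something like this running in the cluster, the curl command from the feature description above would hit the :predict route and get JSON predictions back.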

How important would you say this feature is?

2: Somewhat important. Needed this quarter.

Additional information

AWSlabs: https://github.com/awslabs/multi-model-server

ForestFlow: https://github.com/ForestFlow/ForestFlow

Triton Inference server: https://github.com/triton-inference-server/server

antalszava commented 3 years ago

Hi @arshpreetsingh, thank you for the interesting feature description!

From my side, I'd be curious about some further implementation details.

Specifically:

Thanks again!

albi3ro commented 3 years ago

This seems most relevant to deploying models in production. QML is a long way from being useful in a production environment rather than just R&D.

How would this actually be helpful at this point in time to Quantum Computation? What would be a concrete example use case?

arshpreetsingh commented 3 years ago

Hi @albi3ro @antalszava, apologies for the late reply, I was involved in other things. The idea is not just deploying QML models into production, but easy and fast deployment and use of multiple flavours of PennyLane (PennyLane+TensorFlow, PennyLane+Torch, PennyLane+StrawberryFields, PennyLane+Qiskit) in production-like systems with one command.

Let's suppose I want to install a specific hybrid flavour of PennyLane for my project, say PennyLane+TensorFlow+Torch+Qiskit. All I would have to do is:

helm install Pennylane_TensorFlow_Torch_Qiskit

co9olguy commented 3 years ago

Thanks @arshpreetsingh for the clarification (and the initial request :slightly_smiling_face:). Admittedly this is not something we have much experience with internally on the PennyLane team, so we have not thought much about it before. We might have to discuss internally if/how we can support this.

arshpreetsingh commented 3 years ago

@antalszava

  1. Flask plus any flavour of PennyLane would define one service.
  2. Yes, the volume node will be part of Kubernetes, assuming there would be locations for models that could be mapped according to the model type.

antalszava commented 3 years ago

Hi @arshpreetsingh, thanks for the details! :slightly_smiling_face: We'll be reaching out with more info.

antalszava commented 3 years ago

Hi @arshpreetsingh, thanks again for the proposed idea here! We are keen to discuss the exact details further, mostly regarding the ideas for the feature and the architecture itself.

Our understanding is that you would be proposing the creation of a REST API and a deployment model for deploying the API to the cloud (using Kubernetes and Helm). We'd be wondering: how could the proposed new API extend the capabilities of the PennyLane core repository? Could it be that this is a proposal for a stand-alone product/framework?

Another thing we'd like to get more clarity on is how the endpoint would accept a PL model and trained weights, i.e., how would we tackle the serialization of a PL model to be sent via the API?

If you'd perhaps have a prototype showcasing parts of these details, that would also be great to have a look at. :slightly_smiling_face:

arshpreetsingh commented 3 years ago

Hi @antalszava, thanks for providing better insight into the architectural issues that could come with creating a distributed system as discussed.

Here are my thoughts.

how could the proposed new API extend the capabilities of the PennyLane core repository? Could it be that this is a proposal for a stand-alone product/framework?

Yes, it's more like a framework. Assuming an educational institute or an R&D department at some company wants to do QML research, then instead of juggling one or multiple keys/software packages/libraries to access different kinds of quantum systems (Strawberry Fields, Cirq, or IBM's), PennyLane's Quantum-Cloud would solve that issue. (This idea may be naive; I am open to suggestions/criticisms šŸ˜„)

Another thing we'd like to get more clarity on is how the endpoint would accept a PL model and trained weights, i.e., how would we tackle the serialization of a PL model to be sent via the API?

The initial idea is to send the PL model as a Python file to the server through an scp-like (secure copy) upload interface, for example with the following command (assuming PennyLane Cloud is installed on localhost):

pl-model.py must have three functions: 1. model(), 2. train_model(), 3. test_model().

$ curl -F 'mymodel=@/home/workspace/pl-model.py' https://localhost/user_name/upload_model

Response: {"model-compilation": Success, model-id: "1234" }

model-compilation is just initial syntax checking.
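To make that contract concrete, a hypothetical pl-model.py following the proposed three-function convention could look like the sketch below; the convention and the signatures are part of this proposal, not an existing PennyLane interface:

```python
# pl-model.py -- hypothetical user model following the proposed three-function convention
import pennylane as qml
from pennylane import numpy as np

def model(device_name="default.qubit", wires=2):
    """Build and return the QNode that the server will train and serve."""
    dev = qml.device(device_name, wires=wires)

    @qml.qnode(dev)
    def circuit(features, weights):
        qml.AngleEmbedding(features, wires=range(wires))
        qml.BasicEntanglerLayers(weights, wires=range(wires))
        return qml.expval(qml.PauliZ(0))

    return circuit

def train_model(circuit, data, labels, steps=100):
    """Return trained weights for the circuit."""
    # Shape (n_layers, n_wires) for the 2-wire BasicEntanglerLayers above.
    weights = np.random.uniform(0, np.pi, size=(3, 2), requires_grad=True)
    opt = qml.GradientDescentOptimizer(stepsize=0.1)

    def cost(w):
        loss = 0.0
        for x, y in zip(data, labels):
            loss = loss + (circuit(x, w) - y) ** 2
        return loss / len(data)

    for _ in range(steps):
        weights = opt.step(cost, weights)
    return weights

def test_model(circuit, weights, data, labels):
    """Return a simple accuracy-style score on held-out data."""
    predictions = [np.sign(circuit(x, weights)) for x in data]
    return float(np.mean([p == np.sign(y) for p, y in zip(predictions, labels)]))
```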

$ curl -X POST -d 'device=cirq&interface=tensorflow&wires=2&shots=1000&model-id=1234' https://localhost/user_name/train_model

Response: {"model-training": Success, model-id: "1234" }

And for model testing:

$ curl https://localhost/user_name/test_model
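On the server side, the train_model endpoint could map those form fields onto the uploaded module roughly as sketched below (the /models storage layout and the function signatures follow the hypothetical pl-model.py above; interface and shots are ignored for brevity, and none of this exists yet):

```python
# train_endpoint.py -- hypothetical handler for the /user_name/train_model route
import importlib.util
import numpy as np
from flask import Flask, request
from flask_restful import Api, Resource

app = Flask(__name__)
api = Api(app)

def load_user_model(model_id):
    # Uploaded pl-model.py files are assumed to be stored under /models/<model-id>/.
    path = f"/models/{model_id}/pl-model.py"
    spec = importlib.util.spec_from_file_location("pl_model", path)
    module = importlib.util.module_from_spec(spec)
    spec.loader.exec_module(module)
    return module

class TrainModel(Resource):
    def post(self):
        device = request.form.get("device", "default.qubit")
        wires = int(request.form.get("wires", 2))
        model_id = request.form["model-id"]

        user_module = load_user_model(model_id)
        circuit = user_module.model(device_name=device, wires=wires)

        # Training data is assumed to sit on the same mounted volume as the model.
        data = np.load(f"/models/{model_id}/train_x.npy")
        labels = np.load(f"/models/{model_id}/train_y.npy")
        weights = user_module.train_model(circuit, data, labels)

        np.save(f"/models/{model_id}/weights.npy", np.asarray(weights))
        return {"model-training": "Success", "model-id": model_id}

api.add_resource(TrainModel, "/user_name/train_model")

if __name__ == "__main__":
    app.run(host="0.0.0.0", port=8080)
```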

I agree with you about the prototype; that would make it more feasible to take this discussion to the next level. šŸš€

CatalinaAlbornoz commented 3 years ago

Hi @arshpreetsingh! I think that the framework you propose is very interesting. It may be useful for benchmarking, and it can help researchers easily choose the hardware that best suits their needs for a specific problem.

Would you like to work on the prototype proposed?

arshpreetsingh commented 3 years ago

Would you like to work on the prototype proposed?

Hi @CatalinaAlbornoz, indeed, I would love to work on that. Actually, I am working on it right now, in my subconscious mind. šŸ˜…

Do you have any specific plan/design in mind?

CatalinaAlbornoz commented 3 years ago

The subconscious mind can do miracles @arshpreetsingh ! It can give you lots of ideas šŸ˜„ .

I don't have any specific plan/design in mind right now so feel free to start and I will let you know if a plan/design does come up! Also don't hesitate to come to the PennyLane community calls on Thursdays and we can talk more about this!

arshpreetsingh commented 3 years ago

@CatalinaAlbornoz Great! Please share the community meeting link.

CatalinaAlbornoz commented 3 years ago

@arshpreetsingh The community calls are on the Unitary Fund Discord server. You can join here https://discord.com/invite/JqVGmpkP96

There on the left you will find the community-call channel within the voice channels. That's where we meet on Thursdays at 11am ET!