Open mehilagarwal opened 3 years ago
@amitracal @olgaatgh @vkumar19 @dalevras @deepquantum88 Can you please comment in the issue so that I can assign you?
Hi..
On Wed, 22 Sep 2021, 4:00 pm, Junye Huang wrote:
> @amitracal @olgaatgh @vkumar19 @dalevras @deepquantum88 Can you please comment in the issue so that I can assign you?
Commenting here
Commenting...
Commenting
Commenting here
The first phase of the project is to use reinforcement learning (RL) as the classical optimization algorithm that supports the quantum approximate optimization algorithm (QAOA). At each predefined step, we update the observation_space (the QAOA parameter space) based on the information stored in the active_space (the allowed actions and rewards). The expectation value is then calculated on the quantum backend (or simulator) following the problem definition. If the result of the quantum circuit execution moves the optimization in the desired direction, we “issue” a reward as part of the “reinforcement” and update observation_space accordingly. Otherwise, we penalize the unfavorable choice of parameters. The algorithm repeats this cycle for the given number of iterations.

In this architecture the QAOA environment is agnostic to the RL formulation: it can accept value-function or policy-function optimization, including the Actor-Critic method (A2C), Proximal Policy Optimization (PPO), and Deep Q-Network (DQN).

We are currently testing this approach on the well-known max-cut problem and plan to look at more advanced problems while refining the current QAOA-RL implementation. The attached PDF shows the crux of our project as a flowchart: Flowchart.pdf
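To make the loop described above concrete, here is a minimal sketch in plain Python. It is not the project's code: the 4-node ring graph, the `QAOAEnv` class, the fixed set of parameter nudges, and the epsilon-greedy agent are all assumptions made for illustration. The agent is only a stand-in for A2C, PPO, or DQN, and an exact depth-1 statevector calculation takes the place of a Qiskit backend.

```python
import cmath
import math
import random

# Hypothetical max-cut instance: a 4-node ring (not taken from the project).
EDGES = [(0, 1), (1, 2), (2, 3), (3, 0)]
N_QUBITS = 4


def cut_value(bits: int) -> int:
    """Number of edges cut by the bitstring encoded in the integer `bits`."""
    return sum(((bits >> i) & 1) != ((bits >> j) & 1) for i, j in EDGES)


def qaoa_expectation(gamma: float, beta: float) -> float:
    """Expected cut value of the depth-1 QAOA state, by exact simulation."""
    dim = 1 << N_QUBITS
    # Uniform superposition followed by the cost phase exp(-i * gamma * C(z)).
    amp = 1 / math.sqrt(dim)
    state = [amp * cmath.exp(-1j * gamma * cut_value(z)) for z in range(dim)]
    # Mixer layer: exp(-i * beta * X) applied to every qubit.
    c, s = math.cos(beta), -1j * math.sin(beta)
    for q in range(N_QUBITS):
        mask = 1 << q
        for z in range(dim):
            if not z & mask:
                a, b = state[z], state[z | mask]
                state[z] = c * a + s * b
                state[z | mask] = s * a + c * b
    return sum(abs(a) ** 2 * cut_value(z) for z, a in enumerate(state))


class QAOAEnv:
    """Toy environment: the observation is (gamma, beta), an action nudges it."""

    # Assumed discrete action set: small steps in gamma or beta.
    ACTIONS = [(0.1, 0.0), (-0.1, 0.0), (0.0, 0.1), (0.0, -0.1)]

    def reset(self):
        self.params = (0.0, 0.0)
        self.value = qaoa_expectation(*self.params)
        return self.params

    def step(self, action: int):
        dg, db = self.ACTIONS[action]
        params = (self.params[0] + dg, self.params[1] + db)
        value = qaoa_expectation(*params)
        # Reward is the change in expectation: positive when the move helped,
        # negative (a penalty) when it made the expected cut worse.
        reward = value - self.value
        self.params, self.value = params, value
        return params, reward


def run(steps: int = 200, seed: int = 0) -> float:
    """Epsilon-greedy stand-in agent; returns the best expectation seen."""
    rng = random.Random(seed)
    env = QAOAEnv()
    env.reset()
    best = env.value
    estimates = [0.0] * len(env.ACTIONS)  # running reward estimate per action
    for _ in range(steps):
        if rng.random() < 0.3:
            action = rng.randrange(len(estimates))  # explore
        else:
            action = max(range(len(estimates)), key=estimates.__getitem__)
        _, reward = env.step(action)
        estimates[action] += 0.5 * (reward - estimates[action])
        best = max(best, env.value)
    return best
```

Calling `run()` returns the best expected cut value the stand-in agent reached. Because the environment only exposes `reset` and `step`, a different agent could in principle be swapped in without changing the QAOA part, which is the sense in which the environment is agnostic to the RL formulation.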
I am working with @amitracal @olgaatgh @da66 @vkumar19! Thank you @deepquantum88 for your guidance!
Mentees, can you please upload the final presentation here?
[Uploading #35 Quantum Reinforcement Learning.pptx…]()
@HuangJunye both presentations for the two checkpoints are uploaded now. @olgaatgh @deepquantum88 @da66 @vkumar19 @mehilagarwal FYI
Description
I do not have a specific project in mind, and it would be great if a mentor is interested in this topic. I would like to research Quantum Reinforcement Learning algorithms or implement existing research using Qiskit; if such implementations already exist, I would like to benchmark them.
Mentor/s
Looking for mentors
Type of participant
<What are the profiles of the ideal participants for this idea?>
Number of participants
1-3
Deliverable
Either a prototype (and a paper if possible), or Qiskit code/library for Quantum RL algorithms, or benchmarking results.