Could TFQ/RL_VQC work for MountainCar-v0?

Kongyj commented 3 years ago

Dear Owen,

Thanks for your implementation of RL_VQC and the accompanying video; both are great work. The paper shows that RL_VQC works well on the MountainCar and Acrobot environments. I adapted your code to both environments but could not get either to work. Do you have any ideas on how to solve them? Many thanks.

lockwo commented 3 years ago

To actually get the results shown in the paper, I imagine it is important to match the details of their implementation. Did you modify the code to match the hyperparameters in Tables 1, 2, and 3 of the appendix? With MountainCar especially, the default reward is very sparse: I believe the agent gets -1 on every step until the top is reached, so nothing rewards partial progress, and reward shaping is probably important to success. Additionally, they use different optimization hyperparameters for each set of weights. Did you incorporate that? You can see how to do it in https://www.tensorflow.org/quantum/tutorials/quantum_reinforcement_learning.
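
To make the reward-shaping suggestion concrete, here is a minimal sketch of one common approach, potential-based shaping, written in plain Python so it is independent of the Gym version. It is not the shaping used in the paper. The potential function, the velocity weight of 100, and the discount of 0.99 are all assumptions chosen for illustration; the only environment detail it relies on is the track height curve sin(3x) * 0.45 + 0.55 that MountainCar uses.

```python
import math

GAMMA = 0.99  # assumed discount factor, not taken from the paper


def height(position):
    """Track height at x-position `position` (MountainCar's sin(3x) curve)."""
    return math.sin(3.0 * position) * 0.45 + 0.55


def potential(position, velocity):
    """Hypothetical potential: height plus a scaled kinetic-energy term.

    The factor 100 is an arbitrary choice that brings velocity**2
    (at most 0.07**2) to roughly the same scale as the height.
    """
    return height(position) + 100.0 * velocity ** 2


def shaped_reward(env_reward, state, next_state, gamma=GAMMA):
    """Return env_reward + gamma * phi(next_state) - phi(state).

    `state` and `next_state` are (position, velocity) pairs, as in the
    MountainCar observation. Terminal states get no special handling here.
    """
    phi_now = potential(state[0], state[1])
    phi_next = potential(next_state[0], next_state[1])
    return env_reward + gamma * phi_next - phi_now


if __name__ == "__main__":
    # A step that climbs and gains speed is penalized less than standing still.
    print(shaped_reward(-1.0, (-0.5, 0.0), (-0.4, 0.02)))
    print(shaped_reward(-1.0, (-0.5, 0.0), (-0.5, 0.0)))
```

In a training loop you would call `shaped_reward` on each transition before storing it or computing the update, and keep the raw environment reward for evaluation so results stay comparable with the paper.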

Kongyj commented 3 years ago

Thanks. Your advice is really helpful and I will check all the details in my implementation according to the paper as well as the TFQ guide.

lockwo commented 3 years ago

Ok. If you have more questions you can open a new issue.

lockwo / quantum_computation