Closed rllyryan closed 2 years ago
Hi, thank you for the question. WarpDrive's environment simulator outputs its data as PyTorch tensors, so the training side is no different from any other RL setup built on Python/PyTorch. You can implement any training algorithm written in PyTorch. For Q-learning, you can start from our A2C example and change it to the corresponding Q-learning algorithm. Alternatively, our Lightning example shows how we integrate directly with PyTorch Lightning for training, which should make it even easier for you to plug in a Lightning trainer.
Hi @Emerald01,
Thank you for your explanation and suggestion. I will take a look at the PyTorch Lightning trainer; it seems pretty good at cutting down boilerplate code. As for the A2C example, could I ask which tutorial exactly you are referring to?
I mean the trainer itself is a typical PyTorch trainer, since the output data is a torch tensor: https://github.com/salesforce/warp-drive/blob/master/warp_drive/training/algorithms/a2c.py
So if you would like to use Q-learning or any other algorithm, you can borrow an existing PyTorch implementation directly and use it with WarpDrive.
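To illustrate the point, here is a minimal sketch of what a Q-learning update could look like in plain PyTorch. It is not taken from the WarpDrive codebase and calls no WarpDrive API: the function name `q_learning_loss`, the separate target network, and the batch layout (`obs`, `actions`, `rewards`, `next_obs`, `dones` as tensors that share a leading batch dimension) are all assumptions made for this example.

```python
import torch
import torch.nn as nn


def q_learning_loss(q_net, target_net, obs, actions, rewards, next_obs, dones, gamma=0.99):
    """One-step Q-learning (DQN-style) loss on a batch of transitions.

    Hypothetical helper for illustration only; shapes assumed:
      obs, next_obs: (batch, obs_dim) float tensors
      actions:       (batch,) long tensor of discrete action indices
      rewards, dones: (batch,) float tensors (dones is 1.0 at episode end)
    """
    # Q(s, a) for the actions that were actually taken
    q_taken = q_net(obs).gather(1, actions.unsqueeze(1)).squeeze(1)

    # Bootstrapped target: r + gamma * max_a' Q_target(s', a'), zeroed at episode end
    with torch.no_grad():
        next_q_max = target_net(next_obs).max(dim=1).values
        targets = rewards + gamma * (1.0 - dones) * next_q_max

    return nn.functional.mse_loss(q_taken, targets)


def train_step(q_net, target_net, optimizer, batch, gamma=0.99):
    """Run a single gradient step on one batch and return the loss value."""
    loss = q_learning_loss(q_net, target_net, *batch, gamma=gamma)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```

Because the simulator output is already a torch tensor, a batch like the one assumed above can be passed to `train_step` without further conversion, provided the networks live on the same device as the data.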
Understood! Thank you!
Dear WarpDrive Team,
May I find out if it is possible to implement other reinforcement learning algorithms in WarpDrive (e.g., Q-learning)?
If not, may I ask whether PPO and A2C are considered among the better algorithms in the field? I am not that well informed about the algorithms and their individual advantages, but from what I have gathered from online searches:
Reference: https://medium.datadriveninvestor.com/which-reinforcement-learning-rl-algorithm-to-use-where-when-and-in-what-scenario-e3e7617fb0b1