-
I am trying to train a model using PPO, and the stable-baselines3[extra] library is also installed.
The issue occurs because the StochasticFrameSkip object does not have an action_space attribute, l…
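For reference, here is a minimal sketch of a stochastic frame-skip wrapper, assuming gymnasium and illustrative defaults for `n` and `stickprob` (not the repository's actual implementation). The relevant detail for this error is that subclassing `gym.Wrapper` is what forwards `action_space` and `observation_space` from the wrapped environment:

```python
import gymnasium as gym
import numpy as np


class StochasticFrameSkip(gym.Wrapper):
    """Repeat each action for n frames, occasionally keeping the previous ("sticky") action.

    Subclassing gym.Wrapper forwards attributes such as action_space and
    observation_space from the wrapped env; a plain class without this base
    would raise the missing-attribute error described above.
    """

    def __init__(self, env, n=4, stickprob=0.25):
        super().__init__(env)
        self.n = n
        self.stickprob = stickprob
        self.curac = None
        self.rng = np.random.default_rng()

    def reset(self, **kwargs):
        self.curac = None
        return self.env.reset(**kwargs)

    def step(self, action):
        total_reward = 0.0
        terminated = truncated = False
        for i in range(self.n):
            # On the first skipped frame, keep the previous action with probability stickprob.
            if self.curac is None or i > 0 or self.rng.random() > self.stickprob:
                self.curac = action
            obs, reward, terminated, truncated, info = self.env.step(self.curac)
            total_reward += reward
            if terminated or truncated:
                break
        return obs, total_reward, terminated, truncated, info
```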
-
# Description
I do not have a specific project in mind. It would be great if a mentor were interested in this topic. I would like to work on research into Quantum Reinforcement Learning algorithms …
-
https://datawhalechina.github.io/easy-rl/#/chapter5/chapter5
Description
-
### Is your feature request related to a problem? Please describe.
UAV path planning is a crucial and interesting problem to learn about.
### Describe the solution you'd like.
UAV Path Planning Algorith…
-
- lambdas: discrete action space [1, 2, 3, 4, 5, 6, 7, 8] VERSUS discrete action space [2, 4, 6, 7, 8] VERSUS continuous action space 1 - 8 (see the sketch after this list).
- DDQN versus PPO
- gap_to_optimality: 0.95 VERSUS 0.8 VERSUS 0.7 V…
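For concreteness, a minimal sketch of how the three lambda action-space variants could be declared with gymnasium spaces (variable names are illustrative, not from the codebase). Note that DDQN requires a discrete action space, so the continuous variant would only pair with PPO:

```python
import numpy as np
from gymnasium import spaces

# Variant A: full discrete set of lambdas; Discrete(8) indexes into the value list.
lambda_values_full = [1, 2, 3, 4, 5, 6, 7, 8]
action_space_a = spaces.Discrete(len(lambda_values_full))

# Variant B: reduced discrete set.
lambda_values_reduced = [2, 4, 6, 7, 8]
action_space_b = spaces.Discrete(len(lambda_values_reduced))

# Variant C: continuous lambda in [1, 8]; usable with PPO but not with DDQN.
action_space_c = spaces.Box(low=1.0, high=8.0, shape=(1,), dtype=np.float32)
```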
-
### 🐛 Describe the bug
Hello PyTorch team, I have recently been converting [OpenAI's TensorFlow RLHF code](https://github.com/openai/lm-human-preferences) to PyTorch. Given the same data and model, I was …
-
Add the [Phi-3.5-MoE-instruct](https://huggingface.co/microsoft/Phi-3.5-MoE-instruct) model.
> Phi-3.5-MoE is a lightweight, state-of-the-art open model built upon datasets used for Phi-3 - synthet…
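For reference, a minimal loading sketch along the lines of the Hugging Face model card, assuming a transformers version that supports the model via `trust_remote_code`; this only illustrates the model being requested, not the integration itself:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "microsoft/Phi-3.5-MoE-instruct"

# Native support may vary by transformers version; the model card uses remote code.
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,
    device_map="auto",
    trust_remote_code=True,
)

messages = [{"role": "user", "content": "Explain mixture-of-experts in one sentence."}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)
outputs = model.generate(inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```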
-
These questions are specifically about utilizing discrete DMPs (implemented in this repository) for the reaching task, especially in the context of this paper (https://onlinelibrary.wiley.com/doi/abs/…
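For context, a minimal sketch of the standard discrete DMP formulation (Ijspeert-style) for a 1-D reaching movement; the repository's implementation and the linked paper may differ in details such as the forcing-term scaling:

```python
import numpy as np


def integrate_discrete_dmp(x0, g, weights, centers, widths,
                           tau=1.0, dt=0.001, alpha_s=4.0, K=100.0, D=None):
    """Euler-integrate a 1-D discrete DMP from start x0 toward goal g.

    Canonical system:       tau * ds/dt = -alpha_s * s
    Forcing term:           f(s) = (sum_i psi_i(s) w_i / sum_i psi_i(s)) * s * (g - x0)
    Transformation system:  tau * dv/dt = K * (g - x) - D * v + f(s)
                            tau * dx/dt = v
    """
    D = 2.0 * np.sqrt(K) if D is None else D  # critically damped by default
    x, v, s = float(x0), 0.0, 1.0
    path = [x]
    for _ in range(int(tau / dt)):
        psi = np.exp(-widths * (s - centers) ** 2)  # Gaussian basis functions of the phase s
        f = (psi @ weights) / (psi.sum() + 1e-10) * s * (g - x0)
        v += dt / tau * (K * (g - x) - D * v + f)
        x += dt / tau * v
        s += dt / tau * (-alpha_s * s)
        path.append(x)
    return np.array(path)


# Example: a reach from 0.0 to 1.0 with an untrained (zero) forcing term.
n_basis = 10
centers = np.exp(-4.0 * np.linspace(0, 1, n_basis))  # basis centers spread over s in (0, 1]
widths = np.full(n_basis, n_basis**1.5)
trajectory = integrate_discrete_dmp(0.0, 1.0, np.zeros(n_basis), centers, widths)
```

With all weights set to zero this reduces to a critically damped spring pulling `x` toward the goal `g`; learned weights shape the transient toward a demonstrated reaching trajectory.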