-
### What happened + What you expected to happen
I have a simple gymnasium observation space made up of one float Box and one image.
Using ImpalaConfig, ComplexInputNet is clearly chosen becaus…
-
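From the documentation, this project implements a very rich set of agent algorithms and adapts to many types of environments, but the summary of concrete benchmark results seems rather limited, with a large number of results missing. For example, [Atari](https://xuance.readthedocs.io/zh/latest/documents/benchmark/atari.html), MPE, and MAgent show no experimental results at all, and the only Mujoco results are not very complete either, only…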
-
How do I convert the PPO-trained model (.pt) into HF format?
I tried to convert it with this script, using the following command:
```shell
python scripts/convert_checkpoint_to_hf.py \
--…
```
-
This document lists the features on LMFlow's roadmap. We welcome discussion of and contributions to specific features in the related Issues/PRs. 🤗
### Main Features
* Data
* [x] DPO dataset format…
-
I followed the [suggestion](https://docs.ray.io/en/latest/rllib-training.html#specifying-resources):
`config["framework"] = "torch"`
`config["num_gpus"] = 0.001 # can't work`
`con…
-
See https://github.com/pytorch/pytorch/issues/975 for more info
PyTorch TRPO appears 50% slower than TF. Not sure about PPO, but I expect the wall-clock time gap will be the same.
To fix this is…
-
## **BUG REPORT**
**High Level Description**
Hi! Why does the latest version still have this bug?
**SMARTS version**
[0.4.17]
**Error logs and screenshots**
![image](https://user-images.gi…
-
Hello,
I am currently working with stable-baselines3 (version 1.8.0) and the sb3-contrib PPO Mask algorithm for my custom environment, FindAndAvoidV2RobotSupervisor. While running the code, I encou…
-
**System:**
- Google Colab, L4 GPU
- [1_quickstart.ipynb](https://colab.research.google.com/github/haosulab/ManiSkill/blob/main/examples/tutorials/1_quickstart.ipynb) from ManiSkill, using sapien
…
-
https://github.com/NVIDIA-Omniverse/IsaacGymEnvs/blob/main/docs/rl_examples.md
The page above says that Ant tasks can be trained with the SAC algorithm, but there is no specific modification of t…