-
Hi everyone,
First of all thank you for maintaining this tool!
I am trying to save an RL model trained with stable-baselines3 via MLflow. Not all information from the model is needed, and stable ba…
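The core of the request above is saving only the needed parts of a model as an artifact. Below is a minimal stdlib-only sketch of that idea, using a plain dict as a stand-in for the trained model (the keys and structure are hypothetical, not the real stable-baselines3 layout); in a real run the pickled file would then be handed to `mlflow.log_artifact`.

```python
import os
import pickle
import tempfile

# Hypothetical stand-in for a trained model: in practice this would be a
# stable-baselines3 model object with a much richer structure.
model = {
    "policy_weights": [0.1, 0.2, 0.3],     # the part we actually need
    "replay_buffer": list(range(10_000)),  # large, not needed for inference
    "optimizer_state": {"step": 42},       # not needed for inference
}

def save_needed_parts(model, keys, path):
    """Pickle only the selected entries of the model dict."""
    subset = {k: model[k] for k in keys}
    with open(path, "wb") as f:
        pickle.dump(subset, f)
    return subset

tmpdir = tempfile.mkdtemp()
path = os.path.join(tmpdir, "policy_only.pkl")
saved = save_needed_parts(model, ["policy_weights"], path)

# In a real run, one would then log the file to the active MLflow run, e.g.:
# mlflow.log_artifact(path)
```

The design point is simply to serialize a filtered view of the model rather than the whole object, so the logged artifact stays small.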
-
### Describe your use-case.
There are multiple simple models used in this repository: BLIP, CLIP, and WD taggers. However, when it comes to detailed descriptions, they are all dwarfed by modern multi…
-
https://arxiv.org/abs/1705.10843
"In unsupervised data generation tasks, besides the generation of a sample based on previous observations, one would often like to give hints to the model in order …
mrwns updated 7 years ago
-
### Issue Severity
Minor: a workaround is available (torch must be installed additionally).
### What happened + What you expected to happen
PPO Trainer instantiation via the RLModule API fails if I wan…
-
## Describe your environment
* Operating system: macOS Sonoma 14.4.1 (23E224)
* Python Version: Python 3.12.4 (`python -V`)
* CCXT version: ccxt==4.3.79 (`pip freeze | grep ccxt`)
* Freqtr…
-
I would like to ask for your advice on the following two questions.
1. DPO training does not seem to support DeepSpeed ZeRO. After manually integrating `DPOAlignerArguments` with the `FinetunerArguments…
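One common way to "manually integrate" two argument classes like the ones mentioned above is to combine them via dataclass inheritance, so a single parser sees every field. A minimal sketch, with hypothetical stand-ins for `DPOAlignerArguments` and `FinetunerArguments` (the real classes live in the library and carry many more fields; the config filename is made up):

```python
from dataclasses import dataclass, fields
from typing import Optional

# Hypothetical stand-in for DPOAlignerArguments.
@dataclass
class DPOAlignerArguments:
    beta: float = 0.1
    max_prompt_length: int = 512

# Hypothetical stand-in for FinetunerArguments.
@dataclass
class FinetunerArguments:
    learning_rate: float = 5e-5
    deepspeed: Optional[str] = None  # path to a DeepSpeed ZeRO config JSON

# A combined dataclass that inherits from both: every field from the two
# parents becomes a field of the combined class, in reverse-MRO order.
@dataclass
class CombinedArguments(FinetunerArguments, DPOAlignerArguments):
    pass

args = CombinedArguments(deepspeed="ds_zero3.json")
```

This pattern works with any parser that introspects dataclass fields; whether the resulting `deepspeed` path is actually honored by the training loop is a separate question, which is what the issue is asking about.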
-
Hello, and thank you!
I tested your library today with a KUKA FRI connection. It works, of course, but using the given model file from rl-examples I can't get the correct transformation of the end effector …
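End-effector transformation questions like this usually come down to composing homogeneous transforms in the right order. As a point of reference, here is a stdlib-only sketch of that composition for a toy two-joint planar chain (the link lengths and joint angles are made up for illustration; nothing here is KUKA-specific):

```python
import math

def mat_mul(a, b):
    """Multiply two 4x4 matrices given as lists of lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]

def rot_z(theta):
    """Homogeneous rotation about the z-axis."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s, 0, 0],
            [s,  c, 0, 0],
            [0,  0, 1, 0],
            [0,  0, 0, 1]]

def trans(x, y, z):
    """Homogeneous translation."""
    return [[1, 0, 0, x],
            [0, 1, 0, y],
            [0, 0, 1, z],
            [0, 0, 0, 1]]

# Toy chain: rotate by q1, move L1 along local x, rotate by q2, move L2.
q1, q2 = math.pi / 2, 0.0
L1, L2 = 0.4, 0.3
T = mat_mul(mat_mul(mat_mul(rot_z(q1), trans(L1, 0, 0)),
                    rot_z(q2)), trans(L2, 0, 0))
ee_pos = (T[0][3], T[1][3], T[2][3])  # end-effector position
```

With `q1 = pi/2` and `q2 = 0`, the chain points straight along +y, so the position comes out near `(0, 0.7, 0)`. A mismatch against a vendor model is often just a different convention for the order of these factors or for the base/tool frames.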
-
### Motivation.
In online RL training, vLLM can significantly accelerate the rollout stage. To achieve this, we need to sync weights from the main training process to the vLLM worker process and then call the e…
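The train-then-sync-then-rollout loop described above can be sketched in-process with plain Python. All names below are hypothetical stand-ins (real code would move tensors via `torch.distributed` or vLLM's own weight-loading hooks, not a deep copy):

```python
import copy

class Trainer:
    """Hypothetical stand-in for the main training process."""
    def __init__(self):
        self.weights = {"layer.w": [0.0, 0.0]}

    def train_step(self):
        # Pretend a gradient step updated the weights.
        self.weights = {"layer.w": [0.5, -0.5]}

class RolloutWorker:
    """Hypothetical stand-in for the vLLM worker process."""
    def __init__(self):
        self.weights = None

    def load_weights(self, state_dict):
        # Real code would receive tensors over IPC/NCCL; here we
        # just take an independent copy of the trainer's state dict.
        self.weights = copy.deepcopy(state_dict)

    def generate(self, prompt):
        # Stand-in for the rollout/generation call.
        return f"rollout({prompt}) with {self.weights['layer.w']}"

trainer, worker = Trainer(), RolloutWorker()
trainer.train_step()
worker.load_weights(trainer.weights)  # sync before each rollout phase
out = worker.generate("hello")
```

The point of the sketch is the ordering constraint: the worker must receive the freshest weights before each rollout phase, and it should hold its own copy so later training steps do not mutate the weights mid-generation.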
-
### What happened + What you expected to happen
File: `/ray/rllib/examples/action_masking.py`
Modification:
Replace `ppo.PPOConfig()` on line 97 of `action_masking.py` with `dreamerv3.DreamerV3Config()`.
Bug:
Va…
-
Hey,
I'm wondering if there is any intention to expand the code base for MuZero Unplugged to make it work in an offline RL setting?