-
**Describe the bug**
When trying to follow Hugging Face's Huggy tutorial ([Google Colab Link](https://colab.research.google.com/github/huggingface/deep-rl-class/blob/master/notebooks/bonus-unit1/b…
-
### 🐛 Bug
When using the "check_env" function of "stable_baselines3.common.env_checker" with an environment wrapped in a "FrameStack" wrapper from "gymnasium.wrappers", I get an error on the type of …
-
## Vectorized Knowledge
Large language models are trained on general corpora, and without fine-tuning on user-specific data they struggle to use user-related context effectively.
Users accumu…
-
Hi tudngn!
I've seen your project on GitHub and I'm very interested in understanding what you have done, because I'm working on a shepherding environment using imitation learning techniques, with obstacles an…
-
### Reminder
- [X] I have read the README and searched the existing issues.
### System Info
CUDA_VISIBLE_DEVICES=0 llamafactory-cli train \
--stage rm \
--do_train True \
--model_nam…
-
I am currently training in an environment that has multiple agents. In this case there are multiple snakes all on the same 11x11 grid moving around and eating food. There is one "player" snake and thr…
-
I am trying to sweep a set of hyperparameters using the Slurm submitit plugin.
I run:
`python run.py --multirun --config-name atari-slurm seed=1,2,3,4,5`
And my config file looks something like this:
…
-
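After DDPG training finishes, i.e. after `__prune_rl()`, shouldn't there be an additional call to `self.create_pruner()`? Without it, the new pruning seems to be applied on top of the last compress step from the RL search, which is probably not the right approach. I think it would be better to call `create_pruner()` again. Could someone confirm whether this is correct?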
![屏幕快照 2019-03-21 下午8 56 15](https://user-images.gith…
-
~/ChatScene-main$ PYTHONPATH='./' python scripts/run_train.py --agent_cfg=adv_scenic.yaml --scenario_cfg=train_scenario_scenic.yaml --mode train_scenario --scenario_id 1
setGPU: Setting GPU to: 0
py…
-
Hi,
I noticed that the duration of a task is decided by the code in node.py, which uses np.random.randint to generate the cost time. But if I replace it with np_random, which has a specified seed, the r…
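
For reference, here is a minimal sketch of drawing task durations from a seeded generator instead of the global `np.random` state. It assumes NumPy's `default_rng`. The helper `sample_durations`, its bounds, and the seed value are hypothetical and not taken from node.py, whose `np_random` object may be of a different type.

```python
import numpy as np


def sample_durations(rng, n_tasks, low=1, high=10):
    """Draw one integer cost time per task from the supplied generator.

    Hypothetical helper: the bounds are illustrative, not those used in node.py.
    """
    # Generator.integers excludes `high` by default, like np.random.randint.
    return [int(rng.integers(low, high)) for _ in range(n_tasks)]


# Two generators built from the same seed yield the same sequence of durations.
run_a = sample_durations(np.random.default_rng(42), n_tasks=5)
run_b = sample_durations(np.random.default_rng(42), n_tasks=5)
print(run_a == run_b)  # True
```

Passing the generator explicitly keeps every draw tied to the one seed. Any remaining call to the global `np.random.randint` would still vary between runs.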