-
![image](https://github.com/user-attachments/assets/4a219c5f-226d-49be-973e-86fc384d0bf6)
![image](https://github.com/user-attachments/assets/7fdab145-e875-4a46-8164-67a460d63b9a)
I tried running yo…
-
### Describe the bug
The wandb sweep agent blocks Isaac Sim's thread before app.update() is called, which causes the process to hang forever.
### Steps to reproduce
```python
# Copyright (c)…
-
Hello! I tried an experiment using the llama2 13b model and got a CONNECTION ERROR.
**RL script**
> python -m lamorel_launcher.launch --config-path /home/xxx/Grounding_LLMs_with_online_RL/lamorel…
-
[paper](https://arxiv.org/abs/2305.18290)
## TL;DR
- **I read this because:** to build background knowledge
- **task :** RL
- **problem :** TRPO also requires training a separate reward model, which becomes very costly as models grow larger
- **idea :** rewa…
-
After DDPG training finishes, i.e., after __prune_rl(), shouldn't self.create_pruner() be called again? Without it, the new pruning seems to be applied on top of the RL stage's final compress, which doesn't look right. Re-creating the pruner via create_pruner() seems better. Can anyone confirm whether this is the case?
![屏幕快照 2019-03-21 下午8 56 15](https://user-images.gith…
-
AutoModelForCausalLM has no class for chatglm — how did you solve this?
-
With the proliferation of models and model variants it becomes more important to track assessment dates and model versions.
So far we've been able to treat model families as one, because it rarely …
-
I was able to fine-tune with a modified version of example 2 with the following action head:
```python
config["model"]["heads"]["action"] = ModuleSpec.create(
    L1ActionHead,
    pred_horizon=9,
…
-
Hi, thanks for open-sourcing your amazing work!
I have been trying to reproduce the RL fine-tuned results reported in the paper, but unfortunately, I am encountering some issues. Here is a brief o…
-
Thank you for your great work. I'm interested in reproducing your results.
```bash
python run.py --exp-config ./configs/experiments/XGX.yaml --run-type train
```
However, I encountered an issue …