-
self:1:0: F0001: No module named self (fatal)
************* Module learnng
learnng:1:0: F0001: No module named learnng (fatal)
************* Module algorithm.py
algorithm.py:1:0: F0001: No module …
-
from safe_rlhf.values.cost import CostTrainer
from safe_rlhf.values.reward import RewardTrainer
# from safe_rlhf.values.regression import RegressionTrainer
The safe_rlhf.values package has no regression module, so the RegressionTrainer import above fails.
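
A minimal sketch of how one might confirm that a submodule is missing before importing from it, in the spirit of the commented-out import above. The helper name `module_available` is hypothetical and not part of safe_rlhf. Standard-library module names are used so the snippet runs without safe_rlhf installed; substituting `"safe_rlhf.values.regression"` is an assumption about how it would be applied here.

```python
import importlib.util


def module_available(dotted_name: str) -> bool:
    """Return True if `dotted_name` can be located on the import path.

    Hypothetical helper for illustration. Parent packages are imported
    during the lookup, but the target module itself is not.
    """
    try:
        return importlib.util.find_spec(dotted_name) is not None
    except ModuleNotFoundError:
        # Raised when a parent package of `dotted_name` cannot be imported.
        return False


# Stand-in names from the standard library; swap in the real dotted path,
# e.g. "safe_rlhf.values.regression" (assumed usage), in an actual environment.
print(module_available("json.decoder"))     # an existing submodule
print(module_available("json.regression"))  # a submodule that does not exist
```

Guarding the import with such a check gives a clear message about which module is absent instead of an ImportError raised deep inside a training script.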
-
### Required prerequisites
- [X] I have read the documentation.
- [X] I have searched the [Issue Tracker](https://github.com/PKU-Alignment/safe-rlhf/issues) and [Discussions](https://github.com/PKU-…
-
./train.sh
Namespace(n='MsPacman-life_done-wm_2L512D8H-100k-seed1', seed=1, config_path='config_files/STORM.yaml', env_name='ALE/MsPacman-v5', trajectory_path='D_TRAJ/MsPacman.pkl')
A.L.E: Arcade …
-
### System Info
transformers version: 4.35.2
Platform: Linux-5.15.0-1050-aws-x86_64-with-glibc2.31
Python version: 3.10.12
Huggingface_hub version: 0.20.2
Safetensors versio…
-
### Required prerequisites
- [X] I have read the documentation.
- [X] I have searched the [Issue Tracker](https://github.com/PKU-Alignment/safe-rlhf/issues) and [Discussions](https://github.com/P…
-
![image](https://github.com/01-ai/Yi/assets/12292924/c6cc95ce-29a6-4fc5-9d16-916fd57d4119)
![image](https://github.com/01-ai/Yi/assets/12292924/6b076423-4863-4f39-ae2a-2cc4bac0f048)
Environment:
V10…
-
### Required prerequisites
- [X] I have read the documentation.
- [X] I have searched the [Issue Tracker](https://github.com/PKU-Alignment/safe-rlhf/issues) and [Discussions](https://github.com/PKU-…
-
2024-02-01 15:13:58 - Rank: 4 - INFO - Loading critic model from: models/moss-rlhf-reward-model-7B-en/recover...
Loading checkpoint shards: 100%|██████████████████████████████████████████████████████…
-
### Required prerequisites
- [X] I have searched the [Issue Tracker](https://github.com/PKU-Alignment/safe-rlhf/issues) and [Discussions](https://github.com/PKU-Alignment/safe-rlhf/discussions) tha…