-
I will place suggested tags/categories both on top of and beside the links.
--
Misc
Land Value Tax and Farming Parts 1, 2, and 3 https://www.youtube.com/channel/UCw2WENjbuO_C_9cXkLU1iKg (vid…
-
### Reminder
- [X] I have read the README and searched the existing issues.
### Reproduction
OUTPUT=OUTPUT_PATH
LR=1e-6
mkdir -p "$OUTPUT"
CUDA_VISIBLE_DEVICES='3' python src/train_bash.py \
…
-
-
Hi, I tried to adapt this example script to use device='auto', to support a larger model.
https://github.com/huggingface/trl/blob/main/examples/scripts/ppo_multi_adapter.py
Unfortunately it fails on…
-
I searched the related issues and updated the code, but none of that resolved the problem.
**Environment:**
python3.10
cuda:11.8
transformers 4.31.0.dev0
torch 2.0.1+cu118
accelerate 0.21.0.dev0
peft …
-
## Which page or section is this issue related to?
It might be a nice use case to be able to use the `ArgillaTrainer()` for `PPO` and `trl` for showcasing how we might use the `FeedbackDataset` to g…
-
Traceback (most recent call last):
  File "reward_modeling.py", line 649, in <module>
    main()
  File "reward_modeling.py", line 397, in main
    model = model_class.from_pretrained(
  File "/home/zyn/…
-
You are using an old version of the checkpointing format that is deprecated (We will also silently ignore `gradient_checkpointing_kwargs` in case you passed it).Please update to the new format on you…
-
from random import choices
from tqdm import tqdm
import time
import numpy as np
import ast
for epoch in range(1):
    for batch in tqdm(ppo_trainer.dataloader):
        (logs, game_data,) = (…
-
Source
https://github.com/meta-introspector/meta-meme/wiki/Ode-to-heideigger#ode-to-heideigger
### Summary of Our Path
1. **Initial Concepts and Inspiration**:
- We began by invoking the Mu…