lucidrains/self-rewarding-lm-pytorch
Implementation of the training framework proposed in Self-Rewarding Language Model, from MetaAI
MIT License
1.32k stars, 73 forks
issues
What's the reference model for DPO?
#31
Draconda
closed
5 months ago
1
OSError: [Errno 22] Invalid argument: 'preference_seq.memmap.npy'
#30
Oloup
opened
5 months ago
0
Fixed deep copy, shallow copy error and label mask error.
#29
Control-derek
closed
5 months ago
1
Solves the problem that some variables are not declared
#28
Control-derek
closed
6 months ago
1
Solves the problem that some variables are not declared
#27
Control-derek
closed
6 months ago
1
add self.
#26
Control-derek
closed
6 months ago
1
ModuleNotFoundError: No module named 'x_transformers'
#25
mayankpathaklumiq
opened
7 months ago
1
UnboundLocalError: local variable 'self_reward_model' referenced before assignment
#24
UbeCc
closed
2 months ago
3
What changes should I make to apply the method on Llama2?
#23
Labmem009
opened
7 months ago
0
I encountered the following error when trying to run usage
#21
Yanfors
opened
7 months ago
1
Fix TypeError for is_valid_reward in SelfRewardDPOConfig
#19
ViswanathaReddyGajjala
closed
7 months ago
1
TypeError: tuple indices must be integers or slices, not tuple
#18
fakerybakery
opened
7 months ago
1
Update self_rewarding_lm_pytorch.py
#17
unaidedelf8777
closed
7 months ago
1
RuntimeError: Placeholder storage has not been allocated on MPS device!
#15
fakerybakery
closed
8 months ago
2
Multiple GPUs
#14
fakerybakery
closed
8 months ago
0
Update self_rewarding_lm_pytorch.py
#13
Dyke-F
closed
8 months ago
1
Update spin.py
#12
Dyke-F
closed
8 months ago
2
Why use a custom sample function instead of original HuggingFace generate() function?
#11
scarydemon2
closed
8 months ago
1
How to use HF Transformers model
#10
fakerybakery
opened
8 months ago
3
Default `iteration` about SPIN. (Reward model~Policy model)
#9
KyujinHan
closed
8 months ago
1
run spin demo
#8
westlongtime
closed
8 months ago
3
The reward prompt is weak.
#7
Minami-su
closed
8 months ago
6
Update README.md
#5
eltociear
closed
8 months ago
1
Is this work in progress?
#4
jbdatascience
closed
8 months ago
4
Help with Setting up and running ?
#3
badboysm890
closed
8 months ago
1
code and dataset?
#1
wanghao-007
closed
8 months ago
0