EdoardoPona/predicting-inductive-biases-RL
fork of https://openreview.net/forum?id=mNtmhaDkAr - extending for inductive bias in RL
1 star, 0 forks
issues
#27 default number of steps between rl4lm and trl is drastically different (EdoardoPona, closed, 12 months ago, 1 comment)
#26 plot trl runs (EdoardoPona, closed, 12 months ago, 0 comments)
#25 reward inconsistency with trl controlled sentiment generation example (EdoardoPona, closed, 12 months ago, 2 comments)
#24 remove duplication in lovering main.py and rl_main.py (EdoardoPona, opened, 1 year ago, 0 comments)
#23 Loading arbitrary AutoModels in the Lovering code to run mdl (EdoardoPona, closed, 1 year ago, 1 comment)
#22 Rewards without confidence (EdoardoPona, closed, 1 year ago, 1 comment)
#21 Warm up GPT2 on review data (diogo-cruz, closed, 1 year ago, 1 comment)
#20 Improve plots (diogo-cruz, opened, 1 year ago, 0 comments)
#19 Implementing and running different LLM tasks (diogo-cruz, opened, 1 year ago, 1 comment)
#18 batched rewards (EdoardoPona, opened, 1 year ago, 0 comments)
#17 Sentiment task: finding best hyperparameters. (diogo-cruz, opened, 1 year ago, 2 comments)
#16 Understanding MDL calculations (diogo-cruz, opened, 1 year ago, 1 comment)
#15 llm finetuning (EdoardoPona, opened, 1 year ago, 2 comments)
#14 Clean up toy+RL setup (diogo-cruz, opened, 1 year ago, 0 comments)
#13 Implement GPT-2 finetuning for sentiment generation (diogo-cruz, closed, 1 year ago, 4 comments)
#12 Implement sentiment reward (diogo-cruz, closed, 1 year ago, 4 comments)
#11 Implement sentiment dataset (diogo-cruz, closed, 1 year ago, 5 comments)
#10 run evaluations on test set with feature combination subsets (EdoardoPona, closed, 1 year ago, 1 comment)
#9 Fix issue with generating multi-token (diogo-cruz, opened, 1 year ago, 1 comment)
#8 Implement Alex's reward and test it (diogo-cruz, closed, 1 year ago, 1 comment)
#7 Fine tuning toy models with RL (EdoardoPona, closed, 1 year ago, 2 comments)
#6 Results collection for RL models (EdoardoPona, closed, 1 year ago, 3 comments)
#5 Link lovering data with rl4lm dataset class (EdoardoPona, closed, 1 year ago, 1 comment)
#4 Load custom transformer in RL4LM (EdoardoPona, closed, 1 year ago, 2 comments)
#3 Removing erroneous probes. (diogo-cruz, closed, 1 year ago, 0 comments)
#2 Toy transformer (diogo-cruz, closed, 1 year ago, 0 comments)
#1 Scripts for finetuning and probing runs, and modified files to plot figures (diogo-cruz, closed, 1 year ago, 0 comments)