allenai / RL4LMs

A modular RL library to fine-tune language models to human preferences
https://rl4lms.apps.allenai.org/
Apache License 2.0
2.13k stars · 191 forks

Using GPT-2 #55

Open oroojlooy opened 1 year ago

oroojlooy commented 1 year ago

The README mentions that Actor-Critic policies support causal LMs (e.g., GPT-2/3) and seq2seq LMs (e.g., T5, BART). I was wondering how I can use a GPT-2 model. The following call stack shows that instantiating and loading the model by default creates a seq2seq (encoder-decoder) model. The question is: how can I switch to a regular GPT model?
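For context, the choice between the two policy types is typically config-driven: a policy id from the training config is looked up in a registry and mapped to a policy class. Here is a minimal, illustrative sketch of that dispatch pattern — the class names and registry keys below are my assumptions for illustration, not necessarily RL4LMs' exact identifiers:

```python
# Illustrative sketch of config-driven policy dispatch (NOT actual RL4LMs
# code): a string id from the config selects the policy class to build.

class Seq2SeqLMActorCriticPolicy:
    """Placeholder for an encoder-decoder (T5/BART-style) policy."""

class CausalLMActorCriticPolicy:
    """Placeholder for a decoder-only (GPT-2-style) policy."""

# Registry mapping config ids to policy classes (keys are hypothetical).
POLICY_REGISTRY = {
    "seq2seq_lm_actor_critic_policy": Seq2SeqLMActorCriticPolicy,
    "causal_lm_actor_critic_policy": CausalLMActorCriticPolicy,
}

def build_policy(policy_id: str):
    """Instantiate the policy class registered under policy_id."""
    return POLICY_REGISTRY[policy_id]()
```

Under this pattern, ending up in `seq2seq_policy.py` simply means the config (or its default) resolved to the seq2seq entry; no code change is needed to switch, only a different id.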

train_text_generation.py:84
train_text_generation.py:46 -> main()
training_utils.py:149 -> __init__()
training_utils.py:164 -> _setup()
training_utils.py:118 -> build_alg()
alg_wrappers.py:407 -> wrap_onpolicy_alg()
alg_wrappers.py:108 -> __init__()
ppo.py:166 -> __init__()
ppo.py:169 -> _setup_model()
on_policy_algorithm.py:117 -> _setup_model()
seq2seq_policy.py:51 -> __init__()
base_policy.py:135 -> __init__()
seq2seq_policy.py:67 -> _build_model_heads() (this deep-copies _policy_model into _ref_model, then calls from_pretrained for _policy_model, _value_model, and _ref_model to load the model parameters)
auto_factory.py:446 -> from_pretrained()
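Since the policy is selected by the training config rather than hard-coded, switching to GPT-2 should come down to pointing the config at a causal-LM policy. A hedged sketch of what that config fragment might look like — the key names here follow my reading of the repo's task configs under scripts/training/task_configs, so please verify the exact policy id and keys against those files:

```yaml
# Illustrative fragment only; verify ids/keys against the repo's task configs.
alg:
  id: ppo
  policy:
    id: causal_lm_actor_critic_policy   # instead of the seq2seq policy
    args:
      model_name: gpt2                  # any causal LM checkpoint
```

If the seq2seq path is still taken with such a config, the policy id in the registry (or a default elsewhere in the config) is the place to check next.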