About ChatGPT or InstructGPT

jaymody / gpt-jax

A stupidly simple GPT implementation in JAX.


About ChatGPT or InstructGPT #1

Open sglucas opened 1 year ago

sglucas commented 1 year ago

Hi, very nice repo.

May I ask whether you plan to reproduce ChatGPT/InstructGPT, or GPT with RLHF, in JAX?

Best

jaymody commented 1 year ago

Not sure what you mean. InstructGPT's architecture is the same as GPT-3. InstructGPT is just fine-tuned using RLHF.

If you're asking if I plan to implement the reward model and policy used to fine-tune InstructGPT, I will not be implementing that.

My goal with this repo is to provide a simple, readable, and hackable GPT implementation for educational purposes. RLHF is definitely outside the scope of those criteria.
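
For reference, the reward model mentioned above is typically trained on human comparisons between pairs of responses. Below is a minimal JAX sketch of that pairwise loss, assuming the formulation in the InstructGPT paper, -log(sigmoid(r_chosen - r_rejected)). It is not part of gpt-jax. The function name and the reward values are hypothetical, and a real reward model would produce the scalar rewards from a GPT with a scalar output head.

```python
import jax
import jax.numpy as jnp


def pairwise_reward_loss(r_chosen, r_rejected):
    """Mean of -log(sigmoid(r_chosen - r_rejected)) over a batch of comparisons."""
    return -jnp.mean(jax.nn.log_sigmoid(r_chosen - r_rejected))


# Hypothetical scalar rewards for three (preferred, rejected) response pairs.
r_chosen = jnp.array([1.2, 0.3, -0.5])
r_rejected = jnp.array([0.4, 0.9, -1.0])

loss = pairwise_reward_loss(r_chosen, r_rejected)

# Gradient with respect to the preferred rewards. Every entry is negative,
# so a gradient-descent step raises the reward of the preferred responses.
grad_chosen = jax.grad(pairwise_reward_loss)(r_chosen, r_rejected)
```

The loss shrinks as the preferred response is scored further above the rejected one. The policy-optimization stage of RLHF (PPO in the InstructGPT paper) is a separate step and is not sketched here.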