lucidrains / PaLM-rlhf-pytorch

Implementation of RLHF (Reinforcement Learning with Human Feedback) on top of the PaLM architecture. Basically ChatGPT but with PaLM
MIT License
7.67k stars 668 forks

PaLM-rlhf-pytorch Roadmap #18

Closed by HappyPony 1 year ago

HappyPony commented 1 year ago

Hi,

Unfortunately, the comments in the thread "An Open-Source Version of ChatGPT is Coming" sound too technical to my ears. I would like a summary, a roadmap of who is starting (or wants to start) what, and with which means. That way I would have an idea of when I, as a user, can get involved in helping to train the open-source language model, just as I currently do for OpenAI's ChatGPT.

lucidrains commented 1 year ago

@HappyPony if you aren't doing a phd, the only way to participate is from the data angle. there is also potentially room to contribute by building the application for collecting human feedback to train the reward model, but right now it is uncertain if this approach will be usurped by something like RL"AI"F, as Anthropic is promoting

i would suggest joining Laion and just helping out with Yannic Kilcher's similar efforts

lucidrains commented 1 year ago

@HappyPony if you truly want to understand what is going on beneath the surface, without getting a graduate degree, i highly recommend starting with fast.ai, before working your way into transformers and reinforcement learning

HappyPony commented 1 year ago

Thank you for the feedback, @lucidrains. To clarify: I have no ambition to contribute greatly to the development of the models or algorithms, although I have a university degree in physics and programming experience. But I am interested in contributing as a tester, and I'd like to have an idea of the timescales involved until there is something to test ;-)

lucidrains commented 1 year ago

@HappyPony yea, i would say, go mingle with the people doing the real work and see what they need

mainly Laion and CarperAI at this point