lucidrains / PaLM-rlhf-pytorch

Implementation of RLHF (Reinforcement Learning with Human Feedback) on top of the PaLM architecture. Basically ChatGPT but with PaLM
MIT License
7.67k stars 668 forks

Is it possible to release a code based on jax? #16

Closed sglucas closed 1 year ago

sglucas commented 1 year ago

Is it possible to release a version of this code based on JAX?

lucidrains commented 1 year ago

@sglucas yea it is possible

are you part of an organization or on the TRC program? there is only a handful of people / companies using TPUs outside of google and deepmind

sglucas commented 1 year ago

@lucidrains Hi, I'm supported by the TRC program and can use TPUs for research.

Thank you very much for your contribution.

conceptofmind commented 1 year ago

@lucidrains @sglucas

I will begin to work on a release in Jax now. Will follow the same exact structure as the PyTorch version: https://github.com/conceptofmind/PaLM-rlhf-jax

Is there a preference for a framework? I can do either Flax or Haiku. Flax is better maintained, and Cristian / Marc are likely to answer questions in the discussion threads, but if there is a strong preference for Haiku, that can be done as well.

lucidrains commented 1 year ago

flax would be the safe bet, but i'm also a fan of Patrick Kidger's Equinox. i'll let you and @sglucas decide!

conceptofmind commented 1 year ago

Got accepted to the TRC program this morning as well. Going to go with the safe bet in Flax to start. Can convert to other frameworks later. I will let you know when it is finished, either by posting here or by email.

Thank you,

Enrico

yejingxin commented 1 year ago

Hi Enrico,

Is there any recent update on this?

Do you have the TPUs you need? If not, we (from CloudTPU) can provide more help on this.

Thanks, Jingxin

conceptofmind commented 1 year ago

Hi Jingxin,

I have been working on this locally. I have not pushed the updates to GitHub yet, as the TRC program gives only limited access to TPUs, so I have to be cautious with my use.

If you are able to provide access to TPU time for testing that would be greatly appreciated.

I will try to push more updates soon.

Thank you,

Enrico