kingoflolz / mesh-transformer-jax

Model parallel transformers in JAX and Haiku
Apache License 2.0
6.26k stars, 890 forks

Finetuning Hardware Recommendations #258

Open greyweb opened 12 months ago

greyweb commented 12 months ago

Hi, I am trying to finetune GPT-J 6B starting from the HF-converted weights. Could anyone share recommendations on the compute (hardware) that is widely used or suggested for finetuning GPT-J 6B?