salesforce / jaxformer

Minimal library to train LLMs on TPU in JAX with pjit().
BSD 3-Clause "New" or "Revised" License

Request for training configuration of CodeGen 16B #24

Open sh0416 opened 1 year ago

sh0416 commented 1 year ago

I want to fine-tune the 16B CodeGen checkpoint on TPU.

The config directory does not include a configuration for that model size.

Could you share the configuration, or some documentation on scaling the model parameters?

sh0416 commented 1 year ago

FYI, I am planning to use TPUv3-256 or more cores for that.