Any continued pretraining on TPU?

google-gemini / gemma-cookbook

A collection of guides and examples for the Gemma open models from Google.

https://ai.google.dev/gemma/

Apache License 2.0

668 stars 123 forks

Closed (msamwelmollel closed this issue 1 month ago)

msamwelmollel commented 1 month ago

I would like the cookbook to add an example of continued pretraining on TPU.

windmaple commented 1 month ago

Added to the wishlist.

windmaple commented 1 month ago

Actually, do you have access to TPU v5? If not, it's a moot point.

TPU v3/v4 (32 GB HBM per chip) will run out of memory (OOM) with full-parameter tuning of Gemma 7B. 2B doesn't work either, since the sharding is different.
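
For a rough sense of why full-parameter tuning runs into the HBM limit, here is a back-of-the-envelope sketch. It is not taken from the cookbook. It assumes roughly 8.5 billion parameters for Gemma 7B, float32 weights and gradients, and an Adam-style optimizer that keeps two extra states per parameter. It ignores activations and assumes the state is split perfectly evenly across chips, so real usage will be higher. The function names and the `GEMMA_7B_PARAMS` constant are made up for illustration.

```python
def training_state_gb(num_params, bytes_per_value=4, optimizer_states=2):
    """Rough memory (in GB, 1 GB = 1e9 bytes) for weights, gradients,
    and optimizer states. Activations are NOT included."""
    copies = 1 + 1 + optimizer_states  # weights + gradients + optimizer states
    return num_params * bytes_per_value * copies / 1e9


def fits_in_hbm(num_params, num_chips, hbm_per_chip_gb=32.0):
    """True if the training state, split evenly across chips, fits in HBM.
    Even splitting is an optimistic assumption."""
    per_chip_gb = training_state_gb(num_params) / num_chips
    return per_chip_gb <= hbm_per_chip_gb


# Assumption: approximate total parameter count for Gemma 7B.
GEMMA_7B_PARAMS = 8.5e9

total_gb = training_state_gb(GEMMA_7B_PARAMS)
print(f"training state: ~{total_gb:.0f} GB")
print("fits on 1 chip: ", fits_in_hbm(GEMMA_7B_PARAMS, num_chips=1))
print("fits on 4 chips:", fits_in_hbm(GEMMA_7B_PARAMS, num_chips=4))
```

Under these assumptions the weights, gradients, and optimizer states alone come to well over 32 GB, so a single chip is not enough, and even four chips fall short before activations are counted. Lower-precision state or parameter-efficient tuning would change the numbers.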