How to solve the 'RESOURCE_EXHAUSTED' error when loading 'gemma2_instruct_2b_en' (the script is from kaggle and runs on colab with TPU)?

google / gemma_pytorch

The official PyTorch implementation of Google's Gemma models

https://ai.google.dev/gemma

Apache License 2.0


How to solve the 'RESOURCE_EXHAUSTED' error when loading 'gemma2_instruct_2b_en' (the script is from kaggle and runs on colab with TPU)? #72

Status: Open. Opened by nicewang 1 month ago.

nicewang commented 1 month ago

How can I solve the 'RESOURCE_EXHAUSTED' error when loading 'gemma2_instruct_2b_en'? The script is from Kaggle and runs on Colab with a TPU. The error is shown below:

The environment is Colab, opened from a Kaggle notebook, with a TPU v2-8 accelerator. Resource usage: RAM 6.02 GB / 334.56 GB, Disk 22.13 GB / 225.33 GB.

I changed XLA_PYTHON_CLIENT_MEM_FRACTION from 0.1 to 1.00, but it made no difference:
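For reference, a minimal sketch of how this variable is commonly set from Python. The helper name `set_xla_mem_fraction` is hypothetical and is not taken from the original notebook. JAX documents XLA_PYTHON_CLIENT_MEM_FRACTION in the context of GPU memory preallocation; whether it affects TPU allocation is not established in this thread. The sketch also assumes the variable should be set before JAX is imported, since it is read when the backend initializes.

```python
import os
import sys


def set_xla_mem_fraction(fraction: float) -> str:
    """Write XLA_PYTHON_CLIENT_MEM_FRACTION and return the string that was set.

    Hypothetical helper for illustration only.
    """
    if not 0.0 < fraction <= 1.0:
        raise ValueError("fraction must be in the interval (0, 1]")
    if "jax" in sys.modules:
        # Assumption: a value set after JAX has been imported may be ignored,
        # so restarting the runtime may be needed for it to take effect.
        print("warning: jax is already imported; restart the runtime first")
    value = f"{fraction:.2f}"
    os.environ["XLA_PYTHON_CLIENT_MEM_FRACTION"] = value
    return value


# Run this in the first notebook cell, before any `import jax` or `import keras`.
set_xla_mem_fraction(1.0)
```

If the error persists with the variable set this way, as reported here, the cause probably lies elsewhere, for example in the installed package versions.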

Gopi-Uppari commented 1 month ago

Hi @nicewang,

I encountered the same issue when running on Google Colab with the runtime set to TPU v2-8, but it worked fine on Kaggle with the TPU VM v3-8 runtime. Could you please refer to this Gist file?

@pengchongjin, could you please take a look at this issue?

Thank you.

nicewang commented 1 month ago

Thanks @Gopi-Uppari,

I followed your shared Colab file, and it finally worked both on Kaggle with TPU VM v3-8 and on Colab with TPU v2-8.

I am also curious whether a dependency conflict was the cause, since the only thing I changed was my dependency installation, from: to:
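One way to investigate a suspected dependency conflict is to record the installed package versions in both the failing and the working setup and compare them. The sketch below uses the standard-library `importlib.metadata` module. The package names in the usage line are assumptions chosen for illustration; the actual install commands are not shown in this thread.

```python
from importlib import metadata


def installed_versions(packages):
    """Return {package name: installed version, or None if not installed}."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


# Assumed package list; replace it with whatever the notebook installs.
for name, version in installed_versions(["jax", "jaxlib", "keras", "keras-nlp"]).items():
    print(f"{name}: {version or 'not installed'}")
```

Running this in each environment and comparing the output would show which versions differ between the failing and the working installation.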

Gopi-Uppari commented 2 weeks ago

Hi @nicewang,

I'm glad it worked for you. A dependency conflict could well have been the cause.

Thank you.