-
**Describe the bug**
Out of memory for a relatively small GPT-2 model with 150M parameters
```
jaxlib.xla_extension.XlaRuntimeError: RESOURCE_EXHAUSTED: XLA:TPU compile permanent error. Ran out of memory in me…
```
-
**Describe the bug**
Error while trying to run TinyLlama
```
File "/home/**/research/easydel/.venv/lib/python3.10/site-packages/EasyDel/modules/llama/modelling_llama_flax.py", line 933, in __call__
…
```
-
Hi @erfanzar,
Thanks for the great repo! It looks really useful for training open-source models on TPUs and GPUs!
I wonder if it would be easy to implement a feature that allows users to pass in pack…
-
**Describe the bug**
An error occurs while training TinyLlama on Kaggle
```
/root
/usr/local/lib/python3.10/site-packages/pydantic/_internal/_fields.py:149: UserWarning: Field "model_name" has conflic…
```
-
**Describe the bug**
Hi, I really appreciate your continued commitment to this project and to making it better and better. I'm one of the people who benefit greatly from it. Thank you.
Now, I am trying to fine…
-
**Describe the bug**
Hi, I ran the code below on a Kaggle TPU VM v3-8. When I set the attn_mechanism to "normal", it worked well. However, when I changed the attn_mechanism to "ring", the error below was raised. C…
-
**Describe the bug**
Hi, I tried to fine-tune the gemma-2b model with `sharding_array=(1, 1, 1, -1)` on a Kaggle TPU VM v3-8.
There are two batch-size-related parameters in TrainArguments: total_batch_size, …
-
**To Reproduce**
```
Time Took to Complete Task configure dataloaders (microseconds) : 0.3025531768798828
Time Took to Complete Task configure Model ,Optimizer ,Scheduler and Config (microsec…
```
-
```
ValueError: Loading this model requires you to execute custom code contained in the model repository on your local machine. Please set the option `trust_remote_code=True` to permit loading of thi…
```
-
**Describe the bug**
Getting the following error while running a Llama model after training it with EasyDel and converting it to Hugging Face format.
```
python serve_llama_tpu_easydel.py
Loading checkpoint…
```