-
**Describe the bug**
While trying the DPO trainer example, I hit a bug with batch size and sharding. Maybe the shard axes are not set properly, or it could be a JAX error as well. The system used is a TPU v3-32 with 4 hosts.
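For context on what "shard axes" means here, below is a minimal generic JAX sharding sketch (plain JAX, not EasyDeL-specific; the axis names `dp`/`mp` are illustrative assumptions). A batch whose leading dimension is not divisible by the data-parallel axis size triggers errors like the one described:

```python
import jax
import jax.numpy as jnp
import numpy as np
from jax.sharding import Mesh, NamedSharding, PartitionSpec as P

# Build a device mesh; the axis names ("dp", "mp") are illustrative --
# EasyDeL's actual axis names may differ.
devices = np.array(jax.devices()).reshape(-1, 1)
mesh = Mesh(devices, axis_names=("dp", "mp"))

# Shard a batch along the data-parallel axis. The batch size must be
# divisible by the number of devices on that axis, otherwise JAX raises
# a sharding error.
batch = jnp.ones((8, 16))
sharding = NamedSharding(mesh, P("dp", None))
sharded_batch = jax.device_put(batch, sharding)
print(sharded_batch.shape)  # (8, 16)
```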
…
-
Hello,
Are you planning to add support for LLaMA 2 to further pretrain the models?
-
**Describe the bug**
Received an error while using pipeline generation with the llama model.
```
WARNING: All log messages before absl::Initiali…
-
**Describe the bug**
I don't know where the bug is, maybe in FJFormer or the training loop, but the training loss drops to zero after the first epoch of full finetuning. I have tried two datasets; maybe I will try ano…
-
Hi, thanks for the package! I have a custom non-HuggingFace dataset and I found the following documentation for [data processing](https://github.com/erfanzar/EasyDeL/blob/513fc33a16f6465e88a14cd74b229…
-
NaN losses when training:
![image](https://github.com/user-attachments/assets/78126797-27e6-433c-91bb-cf8260302e6c)
Please take a look at this code:
```
!pip install jax[tpu]==0.4.28 -f https:…
-
**Describe the bug**
I want to train a reward model using EasyDeL with sequence classification. The classifier has been implemented in the Flax sequence-classification classes for each model, but is ther…
-
**Describe the bug**
There seems to be an issue with the partitioning and sharding of the model. When running the llama-3.2-1b model, we are getting a rate of 2 tokens/sec.
```
poetry run python 1.py
WARNI…
-
I used every method in the docs and tried both SFT and fine-tuning for LLaMA 2. Nothing runs, and no one responds. Do not use EasyDeL, and do not use JAX, if you want to train an LLM on GPU.
-
**Describe the bug**
Can't train with multiple VMs on a TPU v4-32.
It stops after loading the model and won't even load the data.
I've been trying for two days; maybe my setup is wrong.
Really want to know w…