-
**Describe the bug**
While trying the DPO trainer example, I hit a bug with batch size and sharding. Maybe the shard axes are not set properly, or it could be a JAX error as well. The system used is a TPU v3-32 with 4 hosts.
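For context on what "shard axes" means here, below is a minimal generic JAX sharding sketch (plain JAX, not EasyDeL-specific; the axis names `dp`/`mp` are illustrative assumptions). A batch whose leading dimension is not divisible by the data-parallel axis size triggers errors like the one described:

```python
import jax
import jax.numpy as jnp
import numpy as np
from jax.sharding import Mesh, NamedSharding, PartitionSpec as P

# Build a device mesh; the axis names ("dp", "mp") are illustrative --
# EasyDeL's actual axis names may differ.
devices = np.array(jax.devices()).reshape(-1, 1)
mesh = Mesh(devices, axis_names=("dp", "mp"))

# Shard a batch along the data-parallel axis. The batch size must be
# divisible by the number of devices on that axis, otherwise JAX raises
# a sharding error.
batch = jnp.ones((8, 16))
sharding = NamedSharding(mesh, P("dp", None))
sharded_batch = jax.device_put(batch, sharding)
print(sharded_batch.shape)  # (8, 16)
```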
…
-
Hello,
Are you planning to add support for LLaMA 2 to further pretrain the models?
-
**Describe the bug**
Received an error while using pipeline generation with the llama model.
```
WARNING: All log messages before absl::Initiali…
-
**Describe the bug**
I don't know where the bug is, maybe in FJFormer or the training loop, but the training loss drops to zero after the first epoch of full finetuning. I have tried two datasets; maybe I will try ano…
-
Hi, thanks for the package! I have a custom non-HuggingFace dataset and I found the following documentation for [data processing](https://github.com/erfanzar/EasyDeL/blob/513fc33a16f6465e88a14cd74b229…
-
NaN losses when training:
![image](https://github.com/user-attachments/assets/78126797-27e6-433c-91bb-cf8260302e6c)
Please take a look at this code:
```
!pip install jax[tpu]==0.4.28 -f https:…
-
**Describe the bug**
I want to train a reward model using EasyDeL with sequence classification. The classifier has been implemented in the Flax sequence-classification classes for each model, but is ther…
-
**Describe the bug**
There seems to be an issue with the partitioning and sharding of the model. When running the llama-3.2-1b model, we are getting a rate of 2 tokens/sec.
```
poetry run python 1.py
WARNI…
-
I used every method in the docs and tried both SFT and fine-tuning for LLaMA 2. Nothing runs, and no one responds. Do not use EasyDeL, and do not use JAX, if you want to train an LLM on GPU.
-
**Describe the bug**
Can't train with multiple VMs on a TPU v4-32.
It stops after loading the model and won't even load the data.
I've been trying for two days; maybe my setup is wrong.
Really want to know w…