unslothai / unsloth

Finetune Llama 3.2, Mistral, Phi, Qwen 2.5 & Gemma LLMs 2-5x faster with 80% less memory
https://unsloth.ai
Apache License 2.0

Fix: cast logits to float32 in cross_entropy_forward to prevent errors #1254

Closed. Erland366 closed this pull request 2 weeks ago.

Erland366 commented 2 weeks ago

#1251 mentioned that there is a data type mismatch between branches, so we need to upcast to float32.

I saw that other logits calculations also use tl.float32, so this should be correct.
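For context, the idea is just to convert the loaded logits to tl.float32 before any reduction, so every branch of the kernel computes in the same dtype. Below is a minimal sketch of that pattern, not Unsloth's actual cross_entropy_forward; the kernel signature and names here are illustrative assumptions:

```python
import triton
import triton.language as tl

@triton.jit
def cross_entropy_forward_sketch(
    logits_ptr, labels_ptr, loss_ptr,
    n_cols, BLOCK_SIZE: tl.constexpr,
):
    # One program handles one row of logits.
    row  = tl.program_id(0)
    cols = tl.arange(0, BLOCK_SIZE)
    mask = cols < n_cols

    # Logits may be stored as float16/bfloat16; upcast to float32 here so the
    # max/logsumexp reduction and the gathered label logit share one dtype.
    logits = tl.load(
        logits_ptr + row * n_cols + cols, mask=mask, other=-float("inf")
    ).to(tl.float32)

    label = tl.load(labels_ptr + row)

    # Numerically stable logsumexp in float32.
    max_logit = tl.max(logits, axis=0)
    logsumexp = max_logit + tl.log(tl.sum(tl.exp(logits - max_logit), axis=0))

    # loss = logsumexp - logits[label]
    label_logit = tl.sum(tl.where(cols == label, logits, 0.0), axis=0)
    tl.store(loss_ptr + row, logsumexp - label_logit)
```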

Default continued pretraining uses this amount of VRAM, so I don't think this increases usage either:

[Screenshots: VRAM usage]

Erland366 commented 2 weeks ago

Tested on Colab too:

[Screenshot: Colab run]

itshahmir commented 2 weeks ago

I am facing this issue in this notebook:

https://colab.research.google.com/drive/1lN6hPQveB_mHSnTOYifygFcrO8C1bxq4?usp=sharing#scrollTo=yqxqAZ7KJ4oL

[Screenshot of the error]