Closed: Erland366 closed this 2 weeks ago
#1251 mentioned that there's a different data type between branches, so we need to upcast it to `float32`. I saw that other logits calculations also use `tl.float32`, so it should be correct.

Default continued pretraining gives this amount of VRAM, so I don't think this should increase usage either .-.

Tested on Colab too.
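For reference, here is a minimal sketch (assumptions only, not the actual diff) of the pattern: logits that live in fp16/bf16 in memory get upcast to `tl.float32` inside the Triton kernel before the reduction, so every branch of the computation uses the same dtype. The kernel name, the logsumexp reduction, and all variable names are illustrative.

```python
import torch
import triton
import triton.language as tl


@triton.jit
def _logsumexp_kernel(logits_ptr, out_ptr, n_cols, BLOCK_SIZE: tl.constexpr):
    # One program handles one row of the logits matrix.
    row = tl.program_id(0)
    cols = tl.arange(0, BLOCK_SIZE)
    mask = cols < n_cols
    # Logits may be stored in fp16/bf16 in global memory ...
    x = tl.load(logits_ptr + row * n_cols + cols, mask=mask, other=-float("inf"))
    # ... so upcast to float32 before the numerically sensitive part,
    # keeping both branches of the calculation in the same dtype.
    x = x.to(tl.float32)
    m = tl.max(x, axis=0)
    lse = m + tl.log(tl.sum(tl.exp(x - m), axis=0))
    tl.store(out_ptr + row, lse)


def logsumexp(logits: torch.Tensor) -> torch.Tensor:
    logits = logits.contiguous()
    n_rows, n_cols = logits.shape
    out = torch.empty(n_rows, dtype=torch.float32, device=logits.device)
    BLOCK_SIZE = triton.next_power_of_2(n_cols)
    _logsumexp_kernel[(n_rows,)](logits, out, n_cols, BLOCK_SIZE=BLOCK_SIZE)
    return out
```

Since the upcast happens on values already loaded into the kernel, not on the stored tensor, it shouldn't change how much VRAM the logits themselves take up.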
I am facing this issue in this notebook:
https://colab.research.google.com/drive/1lN6hPQveB_mHSnTOYifygFcrO8C1bxq4?usp=sharing#scrollTo=yqxqAZ7KJ4oL