Closed · ant-pls-dev closed this 6 months ago
Maybe trying with `torch.no_grad()` can stop the OOM :)

```python
with torch.no_grad():
    a = model(input_ids, labels=input_ids).loss
```

It just works!
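To make the fix concrete: wrapping the forward pass in `torch.no_grad()` tells PyTorch not to store activations for backpropagation, which is what was exhausting the 3050's memory on a plain evaluation pass. Below is a minimal, self-contained sketch of computing perplexity this way. The `ToyLM` class is a hypothetical stand-in, not the Unsloth model; in practice you would call your real model, which returns an output object with a `.loss` field.

```python
import math
import torch
import torch.nn as nn

# Hypothetical tiny causal LM used only so this sketch runs standalone.
# Replace it with your actual Unsloth / Hugging Face model.
class ToyLM(nn.Module):
    def __init__(self, vocab=100, dim=16):
        super().__init__()
        self.emb = nn.Embedding(vocab, dim)
        self.head = nn.Linear(dim, vocab)

    def forward(self, input_ids, labels=None):
        logits = self.head(self.emb(input_ids))
        loss = None
        if labels is not None:
            # Standard causal-LM shift: predict token t+1 from token t.
            loss = nn.functional.cross_entropy(
                logits[:, :-1].reshape(-1, logits.size(-1)),
                labels[:, 1:].reshape(-1),
            )
        return logits, loss

model = ToyLM()
input_ids = torch.randint(0, 100, (1, 32))

# no_grad() skips building the autograd graph, so no activations are
# kept around for a backward pass that will never happen.
with torch.no_grad():
    _, loss = model(input_ids, labels=input_ids)

perplexity = math.exp(loss.item())  # perplexity = exp(mean NLL)
print(perplexity)
```

Since the loss here is the mean token-level negative log-likelihood, exponentiating it gives perplexity directly.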
For the record, this is my setup:

```
==((====))==  Unsloth: Fast Llama patching release 2024.4
   \\   /|    GPU: NVIDIA GeForce RTX 3050. Max memory: 7.777 GB. Platform = Linux.
O^O/ \_/ \    Pytorch: 2.2.2. CUDA = 8.6. CUDA Toolkit = 12.1.
\        /    Bfloat16 = TRUE. Xformers = 0.0.25.post1. FA = True.
 "-____-"     Free Apache license: http://github.com/unslothai/unsloth
```
Thank you for this amazing library and the support!
Hello,
I would like to access the loss of the model, for example to compute perplexity, on an RTX 3050. Usual inference works great, but accessing `model().loss` triggers an OOM:

Result:
The same result occurs with `PYTORCH_CUDA_ALLOC_CONF=expandable_segments:True`.
I do not know whether this is a bug, an unsupported use case, or simply not enough memory on a 3050 for this, nor how to work around it.
Thank you
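As a general fallback (not from this thread): if even a gradient-free forward pass over the full sequence runs out of memory, the loss can be evaluated in fixed-size windows and the per-token negative log-likelihoods averaged before exponentiating. A sketch, assuming an HF-style model whose `model(ids, labels=ids)` call returns an object with a scalar mean `.loss`:

```python
import math
import torch

def windowed_perplexity(model, input_ids, window=512):
    """Perplexity from non-overlapping windows of at most `window` tokens.

    Each window's mean loss is weighted by its number of predicted tokens,
    so the result matches a single full-sequence pass up to windowing at
    the boundaries (context does not flow across windows).
    """
    total_nll, total_tokens = 0.0, 0
    with torch.no_grad():  # no autograd graph: keeps memory bounded
        for start in range(0, input_ids.size(1) - 1, window):
            # +1 so each window predicts `window` tokens from `window` inputs.
            chunk = input_ids[:, start:start + window + 1]
            if chunk.size(1) < 2:  # need at least one (input, label) pair
                continue
            loss = model(chunk, labels=chunk).loss
            n = chunk.size(1) - 1
            total_nll += loss.item() * n
            total_tokens += n
    return math.exp(total_nll / total_tokens)
```

Smaller windows lower peak memory at the cost of less context for each prediction, so perplexity will be somewhat pessimistic compared to a full-sequence pass.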