AetherPrior opened 4 days ago
Hi, what is your PyTorch version? The auto mode should work with torch==2.4.0 and transformers==4.44.2. Additionally, this issue might arise from the small epsilon (1e-9) used in the entropy calculation, although it should still work in those versions.
I've adjusted the epsilon to 1e-6; please check if it works 😊.
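For context, here is a minimal sketch of the kind of entropy computation involved (the exact code in GlitchMiner may differ), and why a too-small epsilon can produce `nan` in half precision:

```python
import torch

def token_entropy(logits: torch.Tensor, eps: float = 1e-6) -> torch.Tensor:
    # Predictive entropy over the vocabulary: H = -sum_i p_i * log(p_i + eps)
    probs = torch.softmax(logits, dim=-1)
    return -(probs * torch.log(probs + eps)).sum(dim=-1)

# Why 1e-9 can break in half precision: it is below the smallest fp16
# subnormal (~6e-8), so it rounds to 0. Any probability that underflows
# to 0 then yields 0 * log(0) = 0 * (-inf) = nan.
print(torch.tensor(1e-9, dtype=torch.float16))  # tensor(0., dtype=torch.float16)
```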
Hi, I suspect the issue might still be related to the versions of PyTorch and transformers. I tested the llama2-7b-chat model on both a four-GPU RTX 4090 setup and a single RTX 4090, and the results were consistent across both configurations, even using an epsilon of 1e-9.
I see, my versions are:
torch==2.5.1
transformers==4.46.3
Let me try downgrading them and checking.
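I'll double-check the installed versions after reinstalling, along these lines:

```python
import torch
import transformers

# Expecting 2.4.0 / 4.44.2 after the downgrade
print(torch.__version__, transformers.__version__)
```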
Hi all, I am trying to use GlitchMiner to run inference on a few of archangel's models, and some of them are too big to fit on a single GPU. Consequently, I'm setting `device_map='auto'` to spread them across all GPUs in my single node. However, upon running GlitchMiner, I get `nan` values filled in for my entropies. My output looks like this for `device_map='auto'`:

and when I replace `device_map='auto'` with `device_map='cuda'`, I get this:

I assume this is because of gradient computations that do not support multi-GPU-loaded models.
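For reference, this is roughly how I'm loading the models (the model id below is a placeholder; I'm actually loading archangel checkpoints):

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Placeholder model id; substitute the actual archangel checkpoint.
model_id = "meta-llama/Llama-2-7b-chat-hf"

tokenizer = AutoTokenizer.from_pretrained(model_id)

# device_map='auto' (requires accelerate) shards the layers across all
# visible GPUs; swapping it for device_map='cuda' puts the whole model
# on a single GPU, which is the case where the entropies come out fine.
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.float16,
    device_map="auto",
)
```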
Was the entire setup run on a single GPU?