iceman-p opened this issue 5 months ago · Open
Same problem
Same issue as #1039
Bumping the VRAM to 80 GB appears to have resolved it for me. Possibly an OOM error?
That would explain why I get the error when I restrict CUDA visibility to a single 48 GB card, but it doesn't solve the main problem: two 48 GB cards should(tm) provide enough VRAM, and the real bug here is that the model isn't being split between the cards.
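If the automatic placement is what's failing, one thing worth trying is an explicit `max_memory` budget so accelerate's planner is forced into a two-way split. This is purely a sketch, not verified against this repo's loader; the per-GPU budgets, dtype, and checkpoint path below are assumptions, not values from this thread:

```python
# Sketch only: cap each GPU's budget below the full 34B model size so the
# device-map planner has to shard the weights across both cards.
# "44GiB" (leaving headroom on a 48 GB card), the dtype, and the
# checkpoint path are assumptions.
import torch
from llava.model import LlavaLlamaForCausalLM

model = LlavaLlamaForCausalLM.from_pretrained(
    "liuhaotian/llava-v1.6-34b",
    torch_dtype=torch.float16,
    low_cpu_mem_usage=True,
    device_map="auto",
    max_memory={0: "44GiB", 1: "44GiB"},  # force a split across both cards
)
```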
Same issue here. Were you able to fix this, @levi @iceman-p?
I suspect this is related to device="auto" and low_cpu_mem_usage=True.
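One way to check that suspicion is to inspect where the weights actually landed. A minimal sketch, assuming the model goes through the HF `from_pretrained` path with a device map (which records the placement in `hf_device_map`); the checkpoint path and dtype are assumptions:

```python
# Sketch: load with the settings suspected above, then inspect where each
# submodule was placed. If everything reports cuda:0, the planner never
# split the model across the two cards.
import torch
from llava.model import LlavaLlamaForCausalLM

model = LlavaLlamaForCausalLM.from_pretrained(
    "liuhaotian/llava-v1.6-34b",   # the 34B checkpoint named in this issue
    torch_dtype=torch.float16,
    low_cpu_mem_usage=True,
    device_map="auto",
)
for module_name, device in model.hf_device_map.items():
    print(device, module_name)
```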
@iceman-p Hi, how did you load the 7B one? I'm having trouble loading it; I get https://github.com/haotian-liu/LLaVA/issues/1112
Describe the issue
Issue: When starting a worker with the 34B version of the 1.6 model, the worker crashes on the first inference. I've verified that the mistral-7b version does work and I can run the demo with it; this only happens with the 34B:
Command:
Log:
Given that the error complains about tensors on two CUDA devices (this is a 2x6000 workstation), I tried running with CUDA_VISIBLE_DEVICES=0 to restrict it to a single card, but that doesn't work either: the worker doesn't even launch successfully, hard-crashing before it communicates with the Gradio process (see the sketch after the log):
Command:
Log:
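As a sanity check on the single-card run: CUDA_VISIBLE_DEVICES only takes effect if it is set before the process initializes CUDA, so it's worth confirming the worker really sees one device. A minimal sketch (setting the variable in-process for illustration; exporting it in the shell before launching the worker is equivalent):

```python
# Sketch: the variable must be set before torch initializes CUDA, so set
# it before the import (or export it in the shell before launch).
# With "0", the process should report exactly one visible device.
import os
os.environ["CUDA_VISIBLE_DEVICES"] = "0"

import torch
print(torch.cuda.device_count())     # expect 1
print(torch.cuda.get_device_name(0))
```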
Commit tested: c878cc3e66f75eb8227870be3d30268789913f82