Open erickFBG opened 12 months ago
Can you check if the files are somehow damaged? What GPUs are these?
The files are OK. I have 3 NVIDIA A2000 GPUs with 6 GiB each. I have already tested another model (fastchat-t5-3b-v1.0) on a single GPU and it works, but when I try to use 3 GPUs the same broken output is returned.
I see. And can you use three instances of the smaller model, one on each GPU, just to make sure you can run them?
I tested a small model here and it works: I loaded fastchat-t5-3b-v1.0 in 8-bit on each GPU.
I ran some new tests and noticed something interesting: when I run with just two GPUs the model doesn't break, but if I try to run with three it is still broken.
The command below returns an error when running on multiple GPUs.
python3 -m fastchat.serve.cli --model-path lmsys/vicuna-7b-v1.3 --num-gpus 3 --max-gpu-memory 6GiB
USER: what can you do?
ASSISTANT: пеamaire favor Arab Dra benep) Rev1 Storage78ITE마zigate point fail4check therefore ':-erserserszeroispecies Theirulleselfoproreu'> definitionSsm
Traceback (most recent call last):
  File "/FastChat/fastchat/serve/inference.py", line 174, in generate_stream
    indices = torch.multinomial(probs, num_samples=2)
RuntimeError: probability tensor contains either `inf`, `nan` or element < 0
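For anyone reading the traceback: the error means the probabilities passed to the sampler were already invalid before sampling started. Below is a rough pure-Python sketch of that condition. It is not FastChat or PyTorch code, and `softmax` and `invalid_entries` are hypothetical helper names used only for illustration. It shows that a single NaN in the logits is enough to make every probability invalid.

```python
import math


def softmax(logits):
    """Softmax over a list of floats, shifted by the max for stability."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def invalid_entries(probs):
    """Return (index, value) pairs that are nan, inf, or negative.

    These are the three conditions named in the RuntimeError message.
    """
    return [
        (i, p)
        for i, p in enumerate(probs)
        if math.isnan(p) or math.isinf(p) or p < 0
    ]


# Finite logits give a valid distribution.
healthy = softmax([2.0, 1.0, 0.1])

# One NaN logit poisons the normalizing sum, so every entry becomes NaN.
corrupted = softmax([2.0, float("nan"), 0.1])

print("healthy invalid entries:", invalid_entries(healthy))
print("corrupted invalid entries:", len(invalid_entries(corrupted)))
```

If this is what is happening here, the logits are probably corrupted somewhere upstream of the sampling line, so an analogous check on the real tensor just before `torch.multinomial` (for example with `torch.isnan` and `torch.isinf`) might help narrow down where the values go bad in the three-GPU setup.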