rmihaylov / falcontune

Tune any FALCON in 4-bit
Apache License 2.0

RuntimeError: CUDA error: an illegal memory access was encountered #22

Open gpravi opened 1 year ago

gpravi commented 1 year ago
falcontune generate \
    --interactive \
    --model falcon-40b-instruct-4bit \
    --weights gptq_model-4bit--1g.safetensors \
    --max_new_tokens=50 \
    --use_cache \
    --do_sample \
    --instruction "Who was the first person on the moon?"

... RuntimeError: CUDA error: an illegal memory access was encountered

I hit the above error while trying to generate on a multi-GPU machine.

gpravi commented 1 year ago

So for now I am just using one GPU, by setting the following: export CUDA_VISIBLE_DEVICES=1
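
For anyone who would rather not export the variable for the whole shell session, here is a minimal Python sketch of the same single-GPU workaround, scoped to one process. Note that `CUDA_VISIBLE_DEVICES` takes device indices, so the value `1` exposes only the GPU with index 1 rather than "one GPU" as a count. The helper name `single_gpu_env` is hypothetical and not part of falcontune; the command is the one from the original report.

```python
import os
import shlex


def single_gpu_env(device_index, base_env=None):
    """Return a copy of the environment that exposes a single CUDA device.

    Hypothetical helper for illustration; it is not part of falcontune.
    """
    env = dict(os.environ if base_env is None else base_env)
    # The value is a device index: "1" selects the GPU with index 1.
    env["CUDA_VISIBLE_DEVICES"] = str(device_index)
    return env


# The command from the original report, split into an argument list.
command = shlex.split(
    "falcontune generate --interactive --model falcon-40b-instruct-4bit "
    "--weights gptq_model-4bit--1g.safetensors --max_new_tokens=50 "
    "--use_cache --do_sample "
    '--instruction "Who was the first person on the moon?"'
)

env = single_gpu_env(1)
# To launch with only that device visible (requires `import subprocess`):
#     subprocess.run(command, env=env, check=True)
```

The variable has to be in place before the process initializes CUDA, which is why the sketch passes it through the child's environment instead of changing it after startup.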

hvico commented 1 year ago

Same here: with two 3090s, falcontune crashes with the same CUDA illegal memory access error.