falcontune generate --interactive --model falcon-40b-instruct-4bit \
    --weights gptq_model-4bit--1g.safetensors --max_new_tokens=50 \
    --use_cache --do_sample \
    --instruction "Who was the first person on the moon?"
...
RuntimeError: CUDA error: an illegal memory access was encountered
While trying to generate on a multi-GPU machine, I encountered the above error.