Hi, I encountered an issue while using Triton for LoRA fine-tuning of mpt-storywriter-4bit. The program fails during training with the following error:
ValueError: Pointer argument (at 1) cannot be accessed from Triton (cpu tensor?)
This issue only occurs when fine-tuning a GPTQ-quantized model on multiple GPUs; fine-tuning the same quantized model on a single GPU works without any problems.
Additionally, I've successfully fine-tuned an uncompressed 8-bit model on multiple GPUs without hitting a similar issue.
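For anyone triaging: my reading of the message (an assumption on my part, with hypothetical names, not Triton's actual internals) is that one of the tensors passed to a Triton kernel is resident on CPU at launch time, which can happen when multi-GPU dispatch leaves a quantized layer's buffer off-device. A minimal stand-in for that argument check:

```python
# Hypothetical sketch of the device check a GPU kernel launcher performs.
# FakeTensor and check_kernel_args are illustrative names, not real Triton APIs.
from dataclasses import dataclass

@dataclass
class FakeTensor:
    device: str  # e.g. "cuda:0", "cuda:1", or "cpu"

def check_kernel_args(*args):
    """Reject any argument whose data does not live on a GPU."""
    for i, t in enumerate(args):
        if t.device == "cpu":
            raise ValueError(
                f"Pointer argument (at {i}) cannot be accessed from Triton (cpu tensor?)"
            )

# In a sharded GPTQ model, one buffer left on CPU is enough to trigger this:
weights = FakeTensor("cuda:0")
scales = FakeTensor("cpu")  # misplaced buffer at position 1
try:
    check_kernel_args(weights, scales)
except ValueError as e:
    print(e)  # Pointer argument (at 1) cannot be accessed from Triton (cpu tensor?)
```

If this reading is right, the single-GPU case works because every buffer is forced onto the one device, while the multi-GPU placement leaves something behind on CPU.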
Environment
Error Traceback