Open kdunee opened 2 months ago
I tracked the issue down to unsloth/models/_utils.py patching accelerate.utils.operations.send_to_device.
For example, this code works:
import accelerate
accelerate.utils.operations.send_to_device({
"commit": ["Hello",]
}, "gpu")
This code doesn't (same as above, TypeError: 'str' object is not callable):
import accelerate
from unsloth.models import _utils
accelerate.utils.operations.send_to_device({
"commit": ["Hello",]
}, "gpu")
I ended up temporarily running this code at the beginning of my notebook to "unpatch" send_to_device:
import accelerate
orig = accelerate.utils.operations.send_to_device
from unsloth.models import _utils
accelerate.utils.operations.send_to_device = orig
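The save-and-restore workaround above can also be written as a small context manager. This is only a sketch: preserve_attr is a hypothetical helper name, and a SimpleNamespace stands in for accelerate.utils.operations so the example runs without accelerate or unsloth installed.

```python
import contextlib
import types


@contextlib.contextmanager
def preserve_attr(owner, name):
    """Restore owner.<name> to its current value when the block exits."""
    original = getattr(owner, name)
    try:
        yield original
    finally:
        setattr(owner, name, original)


# Stand-in for accelerate.utils.operations (assumption for illustration only).
ops = types.SimpleNamespace(send_to_device=lambda data, device: data)
original_fn = ops.send_to_device

with preserve_attr(ops, "send_to_device"):
    # Stand-in for `from unsloth.models import _utils`, which replaces the function.
    ops.send_to_device = "patched"

# After the block, the original function is back in place.
restored = ops.send_to_device is original_fn
```

With the real libraries, the owner would be accelerate.utils.operations and the body of the with block would be the unsloth import. Note that this only resets the attribute on that module; any other module that already copied the patched function into its own namespace keeps its copy.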
Will check and fix! Sorry about the issue!
I also get this error with the DPO Zephyr Unsloth Example (https://colab.research.google.com/drive/15vttTpzzVXv_tJwEk-hIcQ0S9FcEWvwP?usp=sharing). Interestingly, it doesn't occur with the default "unsloth/zephyr-sft-bnb-4bit", but it does when I use my own model (mlabonne/TwinLlama-3.1-8B).
Applying @kdunee's "unpatch" at the beginning of the notebook fixed it for me (thanks!).
Oh, I commented out my override of accelerate. Weirdly, Xformers and Transformers work now. I originally added the patch because inference broke, so I had to add a try/except inside of accelerate. Hopefully DPO and ORPO work now! So sorry about the issue!
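For readers who need to keep a patch like this in place, one defensive option is to wrap the original function and fall back to it when the replacement fails, rather than overwriting it outright. The sketch below is not Unsloth's actual code: patch_with_fallback, strict_send, and the SimpleNamespace stand-in for accelerate.utils.operations are all hypothetical.

```python
import functools
import types


def patch_with_fallback(owner, name, replacement):
    """Replace owner.<name>, deferring to the original if the replacement raises."""
    original = getattr(owner, name)

    @functools.wraps(original)  # also records the original as wrapper.__wrapped__
    def wrapper(*args, **kwargs):
        try:
            return replacement(*args, **kwargs)
        except Exception:
            # The replacement could not handle this input; use the original.
            return original(*args, **kwargs)

    setattr(owner, name, wrapper)
    return original


def original_send(data, device):
    # Stand-in for the library function: returns the data unchanged.
    return data


def strict_send(data, device):
    # Hypothetical replacement that only understands lists.
    if not isinstance(data, list):
        raise TypeError("unsupported input")
    return list(data)


ops = types.SimpleNamespace(send_to_device=original_send)
patch_with_fallback(ops, "send_to_device", strict_send)

# The dict is rejected by strict_send, so the original handles it.
batch = ops.send_to_device({"commit": ["Hello"]}, "gpu")
```

Catching every exception this broadly can hide real bugs in the replacement, so a narrower except clause is usually preferable once the failing case is known.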
For local machines, please update Unsloth via:
pip uninstall unsloth -y
pip install --upgrade --no-cache-dir "unsloth[colab-new] @ git+https://github.com/unslothai/unsloth.git"
@danielhanchen Thank you for this, the fix now works. Tested it extensively across my DPO code. No need for the "unpatch" from @kdunee anymore.
Thanks @danielhanchen!
When running the ORPO Unsloth Example.ipynb notebook, I encountered an error during the execution of orpo_trainer.train(). The error occurs consistently across different GPU types and persists even with slightly older versions of unsloth and its dependencies.
Steps to Reproduce
Run the ORPO Unsloth Example.ipynb notebook through orpo_trainer.train()
Error Message
Environment
Additional Information
Possible Cause
The error seems to be related to the send_to_device function in the accelerate library, specifically when trying to move data to the GPU. It appears that somewhere in this process, the code is attempting to call a string object as if it were a function.
Any assistance in resolving this issue would be greatly appreciated. Let me know if you need any additional information or if you'd like me to run any specific tests.
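As a generic illustration of the error message itself, Python raises this TypeError whenever a string ends up in a position where a function is called. The thread does not identify the exact line responsible, so the snippet below only shows the general mechanism with a stand-in namespace, not the real accelerate or unsloth code.

```python
import types

# Stand-in for a module exposing send_to_device (assumption for illustration).
ops = types.SimpleNamespace(send_to_device=lambda data, device: data)

# A faulty patch binds a string where a function is expected.
ops.send_to_device = "send_to_device"

message = None
try:
    ops.send_to_device({"commit": ["Hello"]}, "gpu")
except TypeError as exc:
    message = str(exc)

print(message)
```

Checking callable(accelerate.utils.operations.send_to_device) and its __module__ attribute in the failing environment is a quick way to see whether the function has been replaced and by which module.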