giuliabaldini opened 1 week ago
Will re-investigate this - apologies for the delay!
Btw, just thinking out loud (or thinking as written text): should we consolidate all these multi-GPU errors into a single function? Right now I see there's `check_nvidia` and the other part of the code in `from_pretrained`.
@Datta0, yeah, I definitely agree. However, I am not very familiar with patching functions this way; wouldn't the function have to be part of all the patched code, meaning that we would have to rewrite it every time?
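To make the consolidation idea concrete, here is a minimal sketch of what a single shared check could look like. Everything in it is an assumption for illustration: the function names, the `UNSLOTH_ALLOW_MULTI_GPU` escape hatch, and the stdlib-only GPU count (real code would presumably use `torch.cuda.device_count()`); none of this is unsloth's actual code.

```python
import os

def visible_gpu_count() -> int:
    """Stdlib-only stand-in for torch.cuda.device_count(): count the
    entries in CUDA_VISIBLE_DEVICES (assumption, for a dependency-free sketch)."""
    devices = os.environ.get("CUDA_VISIBLE_DEVICES", "")
    return len([d for d in devices.split(",") if d.strip()])

def check_single_gpu(caller: str = "") -> None:
    """Raise the multi-GPU error from one place, tagged with the call site.

    `caller` identifies where the check was invoked (e.g. "tokenizer_utils.py:971"),
    so the message stays informative even inside exec-compiled code.
    """
    if os.environ.get("UNSLOTH_ALLOW_MULTI_GPU") == "1":  # hypothetical opt-out
        return
    if visible_gpu_count() > 1:
        raise RuntimeError(
            f"{caller} Unsloth currently does not support multi GPU setups - "
            "but we are working on it!"
        )
```

Each patched call site would then call `check_single_gpu("llama.py:1694")` instead of carrying its own copy of the raise, so disabling or relaxing the check means editing one function.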
I tried deleting the check code in `tokenizer_utils.py` and `llama.py`, but I'm still getting the following error:
```
Traceback (most recent call last):
  File "/home/fdf/dpo_finetune.py", line 116, in <module>
    main()
  File "/home/fdf/dpo_finetune.py", line 108, in main
    trainer.train()
  File "<string>", line 40, in train
RuntimeError: Unsloth currently does not support multi GPU setups - but we are working on it!
```
However, I don't know which line in `unsloth` triggered this error, so I can't proceed with deleting the check code.
Hi guys, I have been a bit busy. I can submit a version with all the fixes on either Thursday or Friday; I have a hectic schedule until then.
Hi @Peter-Fy, did you try to install unsloth from this PR branch? Do you still get the error?
Yes, I installed unsloth from this PR branch, but I still get an error like:
```
Traceback (most recent call last):
  File "/home/fdf/qlora_finetune.py", line 133, in <module>
    main()
  File "/home/fdf/qlora_finetune.py", line 125, in main
    trainer.train()
  File "<string>", line 39, in train
RuntimeError: tokenizer_utils.py:971 Unsloth currently does not support multi GPU setups - but we are working on it!
```
So I deleted the check code at `tokenizer_utils.py:971`, but then I got another error like:
```
Traceback (most recent call last):
  File "/home/fdf/qlora_finetune.py", line 133, in <module>
    main()
  File "/home/fdf/qlora_finetune.py", line 125, in main
    trainer.train()
  File "<string>", line 40, in train
RuntimeError: Unsloth currently does not support multi GPU setups - but we are working on it!
```
That would be helpful; looking forward to your fixes.
Hi there,
this PR has the changes requested in #974. I unfortunately don't have a multi-GPU system where I can test this myself, but I have been testing it with other people on a cluster that has multiple GPUs.
The only problem is that the fix at `llama.py:1694` does not seem to work, as we are still getting the error, so to make it run we have actually removed that check. Any ideas on how to fix this? Is it problematic to remove that check?
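One possible reason the check still fires after editing `llama.py`: the traceback frames above show `File "<string>", line 40, in train`, which suggests the patched `train` is built with `exec`/`compile` from a source string at import time. If the check is baked into that string, removing the original raise in a `.py` file may not be enough. This is an inference from the traceback, not confirmed unsloth behavior; here is a minimal demo (not unsloth code) of why string-compiled functions report `<string>`:

```python
# A function compiled from a source string has no backing file, so its
# traceback frames show "<string>" instead of a real .py path.
import traceback

TEMPLATE = '''
def train():
    raise RuntimeError("Unsloth currently does not support multi GPU setups")
'''

namespace = {}
exec(compile(TEMPLATE, "<string>", "exec"), namespace)

try:
    namespace["train"]()
except RuntimeError:
    tb = traceback.format_exc()

print('File "<string>"' in tb)  # True: the frame has no real file name
```

If this is what's happening, the check would need to be removed (or gated) in the template string that generates `train`, not only at the original call site.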
@hife-ai @Datta0 @Sehyo