ernleite opened this issue 10 months ago
Do we need to source bigdl-llm-init for QLoRA? @qiyuangong @hzjane
I think it's OK; I'll add it to the README file.
Hello, I am trying to fine-tune a Llama 2 model.
Actually, the fine-tuning process was taking a very long time and I had to cancel it, because it was using only one core on my machine (Dell R730 with 2 CPUs / 56 logical cores). I tried accelerate config, but it is not working. Any ideas? Thanks!
Maybe you can try source bigdl-llm-init, or just try taskset -c 0-27 to use more cores.
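A quick sketch of the taskset suggestion above. Pinning a trivial command to core 0 and printing its allowed-CPU list verifies the pinning works on your machine:

```shell
# Pin a command to core 0 and show which CPUs it is allowed to run on.
taskset -c 0 grep Cpus_allowed_list /proc/self/status
```

The same pattern applies to the fine-tuning script, e.g. taskset -c 0-27 python ./qlora_finetuning_cpu.py ... (the range 0-27 is an assumption for one socket of a 56-logical-core machine; adjust it to your CPU topology).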
Thanks for your reply. I already did that. It works, but when it starts "converting the current model to sym_int4 format" everything disappears and only one process remains. Is my server (R730) compatible?
In fact, the python command never worked for me; only llm-convert, llm-cli, etc. work. Very strange. Thanks.
Please check your conda env based on https://github.com/intel-analytics/BigDL/tree/main/python/llm/example/CPU/QLoRA-FineTuning .
I have followed that configuration from the beginning. Thanks.
Hi @ernleite, what do you mean by "the python command never worked"? Have you tried taskset -c 0-27 to use more cores? Could you please share the commands you use to run this QLoRA fine-tuning? We will try to reproduce it.
@glorysdj I meant that all the commands like taskset -c 0-X python ./generate.py or python ./qlora_finetuning_cpu.py do not work for me. The only commands that work (using all the cores on my machine) are llm-convert and llm-cli.
My configuration: Dell R730 with 2 CPUs, 96 GB RAM, Ubuntu 22.04 LTS.
I would be very happy to get this working.
Here is an unresolved issue I reported a few weeks ago: https://github.com/intel-analytics/BigDL/issues/8936
Thanks!
This screenshot shows that only one core is used at a given time (100%).
@ernleite, a quick question: are you able to run bigdl-llm with these Python commands on your local PC (either Windows or Linux)?
I have a laptop running Windows 11. Let me try; I will let you know.
@jason-dai I tried my laptop. The CPU version works fine on Windows 11 (even though it took several hours). A good step, then!
I have two GPUs in my laptop, but I was not able to use my Intel Iris Xe with 16 GB; I have an issue with the PyTorch library.
I tried many configurations, but the QLoRA GPU version does not work. Are we sure it works with Python 3.9? The DLL is present but does not seem to work, and I don't know why. I installed the latest Intel GPU drivers and oneAPI too.
So my question is: does the GPU version work on Windows? And what is the Windows equivalent of source bigdl-llm-init?
Thanks again.
@ernleite Do you have a GPU in your machine? I tried to reproduce the issue and found that after converting the model to sym_int4 format, the fine-tuning program ran on the GPU.
So you can try disabling the GPU when you fine-tune on CPU, and make sure you use the CPU version of the bigdl-llm package.
Hope this helps.
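A minimal sketch of the "disable the GPU" step, assuming a CUDA-capable GPU is the one being picked up (the variable below is the standard CUDA one, not anything BigDL-specific; for an Intel GPU the device selection mechanism differs):

```shell
# Hide CUDA GPUs from this shell so the run stays on the CPU.
export CUDA_VISIBLE_DEVICES=""
test -z "$CUDA_VISIBLE_DEVICES" && echo "GPU hidden"
```

To get the CPU build of bigdl-llm, the QLoRA-FineTuning README linked earlier installs it with pip install --pre --upgrade bigdl-llm[all].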
So my question is : does the GPU version works with Windows ?
Currently it's not supported yet
On my side, it is blocked at 0% progress (for more than 3 hours) on an MTL RVP when running qlora_finetuning_cpu.py.
Command: python ./qlora_finetuning_cpu.py --repo-id-or-model-path llama-2-7b-hf --dataset english_quotes
Env: MTL RVP, 8(E) + 6(P) cores, 96 GB memory, Ubuntu 22.04. I have run "source bigdl-llm-init -t".
Could you also help with that? Thanks!
We fixed this issue (only one core being used) last week; it is related to this PR. When the CPU does not support bf16, QLoRA would automatically use only one core. You can run lscpu | grep bf16 to check whether your CPU supports bf16 and whether that is the cause, and then use the latest qlora_finetuning_cpu.py.
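The bf16 check above can be wrapped in a small script. The flag name bf16 matches substrings such as avx512_bf16 in the lscpu flags output; on CPUs without it, the pre-fix script fell back to a single core:

```shell
# Report whether the CPU advertises a bf16 instruction-set flag.
if lscpu | grep -q bf16; then
  echo "bf16: supported"
else
  echo "bf16: not supported"
fi
```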
Wow, amazing, thanks!
I can confirm that it works much better now.
For the moment it only uses one CPU (I have 2), but maybe that is just a misconfiguration; I am deep-diving into that now.