georgesung / llm_qlora

Fine-tuning LLMs using QLoRA
MIT License

CUDA out of memory error #1

Open benam2 opened 1 year ago

benam2 commented 1 year ago

Hey, thanks for sharing your code!

I'm trying to fine-tune the model on my dataset, but I keep getting an out-of-memory error, even though my GPU has 24 GB of memory. Any ideas on what might be causing this problem?

Thank You!

georgesung commented 1 year ago

Are you using Python 3.7? I noticed that the latest releases of transformers and peft have incompatibility issues with Python 3.7. I updated the README with a troubleshooting section to address this, copied below. You can either use the workaround below or use Python 3.8+.

Issues with Python 3.7

If you're using Python 3.7, pip will install transformers 4.30.x, since transformers >= 4.31.0 no longer supports Python 3.7. If you then install the latest version of peft, GPU memory consumption will be higher than usual. The workaround is to pin an older version of peft that matches the older transformers version. Update your requirements.txt as follows:

```
transformers==4.30.2
git+https://github.com/huggingface/peft.git@86290e9660d24ef0d0cedcf57710da249dd1f2f4
```

Of course, make sure to remove the original `transformers` and `peft` lines first, then run `pip install -r requirements.txt`.
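To double-check whether your environment hits this combination, here's a small sketch (the helper `needs_peft_pin` is my own name, not part of the repo) that flags a Python 3.7 interpreter paired with a 4.30.x transformers install, i.e. the case where the older peft commit should be pinned:

```python
import sys

def needs_peft_pin(python_version, transformers_version):
    """Flag the combination from the README: a Python 3.7 interpreter,
    which caps transformers at 4.30.x, so the matching older peft commit
    should be pinned instead of the latest release."""
    major_minor = tuple(int(x) for x in transformers_version.split(".")[:2])
    return python_version[:2] == (3, 7) and major_minor < (4, 31)

print(needs_peft_pin((3, 7, 9), "4.30.2"))   # True  -> pin the older peft commit
print(needs_peft_pin((3, 10, 0), "4.31.0"))  # False -> latest peft is fine

# Live check against the running interpreter (transformers version is
# whatever your environment actually resolved):
print(needs_peft_pin(sys.version_info, "4.30.2"))
```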

bhuvneshsaini commented 9 months ago

I am fine-tuning on Kaggle with Python 3.10, and I used:

```
transformers==4.30.2
git+https://github.com/huggingface/peft.git@86290e9660d24ef0d0cedcf57710da249dd1f2f4
```

But I am still not able to start training:

```
Unexpected exception formatting exception. Falling back to standard exception
Traceback (most recent call last):
  File "/opt/conda/lib/python3.10/site-packages/IPython/core/interactiveshell.py", line 3508, in run_code
    exec(code_obj, self.user_global_ns, self.user_ns)
  File "/tmp/ipykernel_42/3053083764.py", line 29, in
    trainer.load_base_model()
  File "/kaggle/input/llama-finetune1/llm_qlora-main/QloraTrainer.py", line 34, in load_base_model
    model = LlamaForCausalLM.from_pretrained(model_id, quantization_config=bnb_config, device_map={"":0})
  File "/opt/conda/lib/python3.10/site-packages/transformers/modeling_utils.py", line 3694, in from_pretrained
  File "/opt/conda/lib/python3.10/site-packages/transformers/modeling_utils.py", line 4104, in _load_pretrained_model
  File "/opt/conda/lib/python3.10/site-packages/transformers/modeling_utils.py", line 786, in _load_state_dict_into_meta_model
  File "/opt/conda/lib/python3.10/site-packages/transformers/integrations/bitsandbytes.py", line 108, in set_module_quantized_tensor_to_device
RuntimeError: CUDA error: device-side assert triggered
CUDA kernel errors might be asynchronously reported at some other API call, so the stacktrace below might be incorrect.
For debugging consider passing CUDA_LAUNCH_BLOCKING=1.
Compile with TORCH_USE_CUDA_DSA to enable device-side assertions.
```

During handling of the above exception, another exception occurred:

```
Traceback (most recent call last):
  File "/opt/conda/lib/python3.10/site-packages/IPython/core/interactiveshell.py", line 2105, in showtraceback
    stb = self.InteractiveTB.structured_traceback(
  File "/opt/conda/lib/python3.10/site-packages/IPython/core/ultratb.py", line 1428, in structured_traceback
    return FormattedTB.structured_traceback(
  File "/opt/conda/lib/python3.10/site-packages/IPython/core/ultratb.py", line 1319, in structured_traceback
    return VerboseTB.structured_traceback(
  File "/opt/conda/lib/python3.10/site-packages/IPython/core/ultratb.py", line 1172, in structured_traceback
    formatted_exception = self.format_exception_as_a_whole(etype, evalue, etb, number_of_lines_of_context,
  File "/opt/conda/lib/python3.10/site-packages/IPython/core/ultratb.py", line 1087, in format_exception_as_a_whole
    frames.append(self.format_record(record))
  File "/opt/conda/lib/python3.10/site-packages/IPython/core/ultratb.py", line 969, in format_record
    frame_info.lines, Colors, self.has_colors, lvals
  File "/opt/conda/lib/python3.10/site-packages/IPython/core/ultratb.py", line 792, in lines
    return self._sd.lines
  File "/opt/conda/lib/python3.10/site-packages/stack_data/utils.py", line 144, in cached_property_wrapper
    value = obj.__dict__[self.func.__name__] = self.func(obj)
  File "/opt/conda/lib/python3.10/site-packages/stack_data/core.py", line 734, in lines
    pieces = self.included_pieces
  File "/opt/conda/lib/python3.10/site-packages/stack_data/utils.py", line 144, in cached_property_wrapper
    value = obj.__dict__[self.func.__name__] = self.func(obj)
  File "/opt/conda/lib/python3.10/site-packages/stack_data/core.py", line 681, in included_pieces
    pos = scope_pieces.index(self.executing_piece)
  File "/opt/conda/lib/python3.10/site-packages/stack_data/utils.py", line 144, in cached_property_wrapper
    value = obj.__dict__[self.func.__name__] = self.func(obj)
  File "/opt/conda/lib/python3.10/site-packages/stack_data/core.py", line 660, in executing_piece
    return only(
  File "/opt/conda/lib/python3.10/site-packages/executing/executing.py", line 190, in only
    raise NotOneValueFound('Expected one value, found 0')
executing.executing.NotOneValueFound: Expected one value, found 0
```
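As the first traceback itself suggests, rerunning with `CUDA_LAUNCH_BLOCKING=1` makes kernel launches synchronous, so the stack trace points at the call that actually triggered the device-side assert. A minimal sketch of how to set it from a notebook (the variable must be exported before the first CUDA call in the process, i.e. before torch initializes the GPU):

```python
import os

# CUDA_LAUNCH_BLOCKING=1 forces synchronous kernel launches, so the
# reported stack trace matches the kernel that actually asserted.
# It must be set before torch touches the GPU; in a Kaggle notebook,
# run this in a cell before importing torch or transformers, or
# restart the kernel first if CUDA has already been initialized.
os.environ["CUDA_LAUNCH_BLOCKING"] = "1"

# ...then import torch / transformers and re-run the failing cell.
```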