Model training issue with unsloth/mistral-7b-instruct-v0.2-bnb-4bit

Hi,

I installed the unsloth library by following the instructions in #210. However, running trainer_stats = trainer.train() inside a virtual environment in VS Code fails with the error below.

Note: Microsoft has moved the Universal Windows Platform (UWP) build tools into the Windows Application Development build tools, so UWP no longer appears as a separate option in the workload selection. I used the screenshot in #210 as a guide and manually selected the necessary tools in the Visual Studio Installer.

Here is my system information:

OS: Windows 11
Torch Version: 2.3.1+cu118
CUDA: 11.8
Video Graphics Card: RTX 3070 Ti
Python Version: 3.11
Transformers: 4.42.3
Triton: 2.1.0
DeepSpeed: 0.13.1+unknown
BitsAndBytes: 0.43.1
Unsloth: 2024.7
XFormers: 0.0.26.post1
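
For anyone who wants to compare environments, a list like the one above can be collected with a short script. This is only a sketch: the names in PACKAGES are assumed to match the installed distribution names, and the helper installed_versions is hypothetical, not part of any of these libraries.

```python
import platform
from importlib import metadata

# Assumption: these match the distribution names the packages were installed under.
PACKAGES = ["torch", "transformers", "triton", "deepspeed",
            "bitsandbytes", "unsloth", "xformers"]


def installed_versions(names):
    """Map each distribution name to its installed version, or 'not installed'."""
    versions = {}
    for name in names:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = "not installed"
    return versions


if __name__ == "__main__":
    print("Python:", platform.python_version())
    print("OS:", platform.system(), platform.release())
    for name, version in installed_versions(PACKAGES).items():
        print(f"{name}: {version}")
```

Run it with the interpreter inside the virtual environment, so it reports the packages that trainer.train() actually uses.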

Here is the detailed traceback error message:

trainer_stats = trainer.train()
[2024-07-11 09:09:22,563] [INFO] [real_accelerator.py:191:get_accelerator] Setting ds_accelerator to cuda (auto detect)
W0711 09:09:22.918000 2900 torch\distributed\elastic\multiprocessing\redirects.py:27] NOTE: Redirects are currently not supported in Windows or MacOs.
==((====))==  Unsloth - 2x faster free finetuning | Num GPUs = 1
   \\   /|    Num examples = 200 | Num Epochs = 3
O^O/ \_/ \    Batch size per device = 2 | Gradient Accumulation steps = 4
\        /    Total batch size = 8 | Total steps = 60
 "-____-"     Number of trainable parameters = 41,943,040
  0%| | 0/60 [00:00<?, ?it/s]
ptxas info : 11 bytes gmem
ptxas info : Compiling entry function '_rms_layernorm_forward_0d1de2d3de4d5c6d7c8de9' for 'sm_86'
ptxas info : Function properties for _rms_layernorm_forward_0d1de2d3de4d5c6d7c8de9
    0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
ptxas info : Used 40 registers, 408 bytes cmem[0]
Microsoft (R) C/C++ Optimizing Compiler Version 19.29.30154 for x64
Copyright (C) Microsoft Corporation. All rights reserved.

cl : Command line warning D9035 : option 'o' has been deprecated and will be removed in a future release
cl : Command line warning D9002 : ignoring unknown option '-O3'
cl : Command line warning D9002 : ignoring unknown option '-shared'
cl : Command line warning D9002 : ignoring unknown option '-LC:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v11.8\lib\x64'
cl : Command line warning D9002 : ignoring unknown option '-LC:\Program Files\WindowsApps\PythonSoftwareFoundation.Python.3.11_3.11.2544.0_x64__qbz5n2kfra8p0\libs'
cl : Command line warning D9002 : ignoring unknown option '-lcuda'
main.c
C:\ScoreProjectTesting\hellucination_main\..venv\Lib\site-packages\triton\common\..\third_party\cuda\include\cuda.h(55): fatal error C1083: Cannot open include file: 'stdlib.h': No such file or directory
Traceback (most recent call last):
  File "", line 1, in
  File "", line 126, in train
  File "", line 358, in _fast_inner_training_loop
  File "C:\ScoreProjectTesting\hellucination_main\..venv\Lib\site-packages\transformers\trainer.py", line 3307, in training_step
    loss = self.compute_loss(model, inputs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\ScoreProjectTesting\hellucination_main\..venv\Lib\site-packages\transformers\trainer.py", line 3338, in compute_loss
    outputs = model(**inputs)
              ^^^^^^^^^^^^^^^
  File "C:\ScoreProjectTesting\hellucination_main\..venv\Lib\site-packages\torch\nn\modules\module.py", line 1532, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\ScoreProjectTesting\hellucination_main\..venv\Lib\site-packages\torch\nn\modules\module.py", line 1541, in _call_impl
    return forward_call(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\ScoreProjectTesting\hellucination_main\..venv\Lib\site-packages\accelerate\utils\operations.py", line 819, in forward
    return model_forward(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\ScoreProjectTesting\hellucination_main\..venv\Lib\site-packages\accelerate\utils\operations.py", line 807, in __call__
    return convert_to_fp32(self.model_forward(*args, **kwargs))
                           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\ScoreProjectTesting\hellucination_main\..venv\Lib\site-packages\torch\amp\autocast_mode.py", line 16, in decorate_autocast
    return func(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^
  File "C:\ScoreProjectTesting\hellucination_main\..venv\Lib\site-packages\unsloth\models\llama.py", line 940, in PeftModelForCausalLM_fast_forward
    return self.base_model(
           ^^^^^^^^^^^^^^^^
  File "C:\ScoreProjectTesting\hellucination_main\..venv\Lib\site-packages\torch\nn\modules\module.py", line 1532, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\ScoreProjectTesting\hellucination_main\..venv\Lib\site-packages\torch\nn\modules\module.py", line 1541, in _call_impl
    return forward_call(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\ScoreProjectTesting\hellucination_main\..venv\Lib\site-packages\peft\tuners\tuners_utils.py", line 179, in forward
    return self.model.forward(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\ScoreProjectTesting\hellucination_main\..venv\Lib\site-packages\accelerate\hooks.py", line 169, in new_forward
    output = module._old_forward(*args, **kwargs)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\ScoreProjectTesting\hellucination_main\..venv\Lib\site-packages\unsloth\models\mistral.py", line 216, in MistralForCausalLM_fast_forward
    outputs = self.model(
              ^^^^^^^^^^^
  File "C:\ScoreProjectTesting\hellucination_main\..venv\Lib\site-packages\torch\nn\modules\module.py", line 1532, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\ScoreProjectTesting\hellucination_main\..venv\Lib\site-packages\torch\nn\modules\module.py", line 1541, in _call_impl
    return forward_call(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\ScoreProjectTesting\hellucination_main\..venv\Lib\site-packages\accelerate\hooks.py", line 169, in new_forward
    output = module._old_forward(*args, **kwargs)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\ScoreProjectTesting\hellucination_main\..venv\Lib\site-packages\unsloth\models\llama.py", line 696, in LlamaModel_fast_forward
    hidden_states = Unsloth_Offloaded_Gradient_Checkpointer.apply(
                    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\ScoreProjectTesting\hellucination_main\..venv\Lib\site-packages\torch\autograd\function.py", line 598, in apply
    return super().apply(*args, **kwargs)  # type: ignore[misc]
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\ScoreProjectTesting\hellucination_main\..venv\Lib\site-packages\torch\cuda\amp\autocast_mode.py", line 115, in decorate_fwd
    return fwd(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^
  File "C:\ScoreProjectTesting\hellucination_main\..venv\Lib\site-packages\unsloth\models\_utils.py", line 524, in forward
    output = forward_function(hidden_states, *args)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\ScoreProjectTesting\hellucination_main\..venv\Lib\site-packages\torch\nn\modules\module.py", line 1532, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\ScoreProjectTesting\hellucination_main\..venv\Lib\site-packages\torch\nn\modules\module.py", line 1541, in _call_impl
    return forward_call(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\ScoreProjectTesting\hellucination_main\..venv\Lib\site-packages\accelerate\hooks.py", line 169, in new_forward
    output = module._old_forward(*args, **kwargs)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\ScoreProjectTesting\hellucination_main\..venv\Lib\site-packages\unsloth\models\llama.py", line 453, in LlamaDecoderLayer_fast_forward
    hidden_states = fast_rms_layernorm(self.input_layernorm, hidden_states)
                    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\ScoreProjectTesting\hellucination_main\..venv\Lib\site-packages\unsloth\kernels\rms_layernorm.py", line 190, in fast_rms_layernorm
    out = Fast_RMS_Layernorm.apply(X, W, eps, gemma)
          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\ScoreProjectTesting\hellucination_main\..venv\Lib\site-packages\torch\autograd\function.py", line 598, in apply
    return super().apply(*args, **kwargs)  # type: ignore[misc]
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\ScoreProjectTesting\hellucination_main\..venv\Lib\site-packages\unsloth\kernels\rms_layernorm.py", line 144, in forward
    fx[(n_rows,)](
  File "C:\ScoreProjectTesting\hellucination_main\..venv\Lib\site-packages\triton\runtime\jit.py", line 541, in run
    self.cache[device][key] = compile(
                              ^^^^^^^^
  File "C:\ScoreProjectTesting\hellucination_main\..venv\Lib\site-packages\triton\compiler\compiler.py", line 202, in compile
    so_path = backend.make_launcher_stub(src, metadata)
              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\ScoreProjectTesting\hellucination_main\..venv\Lib\site-packages\triton\compiler\backends\cuda.py", line 224, in make_launcher_stub
    return make_stub(src.name, src.signature, constants, ids, enable_warp_specialization=enable_warp_specialization)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\ScoreProjectTesting\hellucination_main\..venv\Lib\site-packages\triton\compiler\make_launcher.py", line 37, in make_stub
    so = _build(name, src_path, tmpdir)
         ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\ScoreProjectTesting\hellucination_main\..venv\Lib\site-packages\triton\common\build.py", line 124, in _build
    ret = subprocess.check_call(cc_cmd)
          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Program Files\WindowsApps\PythonSoftwareFoundation.Python.3.11_3.11.2544.0_x64__qbz5n2kfra8p0\Lib\subprocess.py", line 413, in check_call
    raise CalledProcessError(retcode, cmd)
subprocess.CalledProcessError: Command '['C:\Program Files (x86)\Microsoft Visual Studio\2022\BuildTools\VC\Tools\MSVC\14.29.30133\bin\Hostx64\x64\cl.exe', 'C:\Users\41769\AppData\Local\Temp\tmpjnruqn6w\main.c', '-O3', '-shared', '-IC:\ScoreProjectTesting\hellucination_main\..venv\Lib\site-packages\triton\common\..\third_party\cuda\include', '-IC:\Program Files\WindowsApps\PythonSoftwareFoundation.Python.3.11_3.11.2544.0_x64__qbz5n2kfra8p0\Include', '-IC:\Users\41769\AppData\Local\Temp\tmpjnruqn6w', '-LC:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v11.8\lib\x64', '-LC:\Program Files\WindowsApps\PythonSoftwareFoundation.Python.3.11_3.11.2544.0_x64__qbz5n2kfra8p0\libs', '-lcuda', '-o', 'C:\Users\41769\AppData\Local\Temp\tmpjnruqn6w\_rms_layernorm_forward.cp311-win_amd64.pyd']' returned non-zero exit status 2.
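
The C1083 error means cl.exe could not find stdlib.h. cl.exe searches the directories listed in the INCLUDE environment variable, which a Visual Studio developer prompt normally sets up, so one possible cause (an assumption, not confirmed here) is that the process running the training script never had that variable set. The sketch below checks whether any INCLUDE directory contains the header. The helper find_header is a hypothetical name, not part of Triton or Unsloth.

```python
import os


def find_header(header, include_value, sep=os.pathsep):
    """Return the directories in an INCLUDE-style string that contain `header`."""
    hits = []
    for directory in include_value.split(sep):
        directory = directory.strip()
        # Skip empty entries, which appear when the variable has stray separators.
        if directory and os.path.isfile(os.path.join(directory, header)):
            hits.append(directory)
    return hits


if __name__ == "__main__":
    include_value = os.environ.get("INCLUDE", "")
    hits = find_header("stdlib.h", include_value)
    if hits:
        print("stdlib.h found in:")
        for directory in hits:
            print("  " + directory)
    else:
        print("stdlib.h not found in any INCLUDE directory "
              "(INCLUDE is %s)." % ("set" if include_value else "not set"))
```

If the header is not found, launching VS Code or Python from a Visual Studio developer command prompt, or confirming in the Visual Studio Installer that a Windows SDK is installed, might be worth trying before rerunning the training.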

Any suggestions on how to fix this error would be great. Thanks!

unslothai/unsloth#760