PermissionError: [WinError 5]

Description

I am encountering a PermissionError: [WinError 5] while running my training script with the LLaMA Factory library on Windows. I hit the same problem with the original unsloth finetune.py. The error occurs during model training, and I have checked several potential causes without success.

Environment

OS: Windows 10/11
Python Version: [Your Python version, e.g., 3.11.4]
LLaMA Factory Version: 0.8.3.dev0
PyTorch Version: 2.2.2+cu121
CUDA Version: [Your CUDA version, e.g., 12.1]
Other Relevant Libraries:
  torchvision: 0.17.2
  transformers: 4.44.0
  triton: 2.1.0
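The placeholders in the list above can be filled in from the running interpreter. The following is a minimal sketch, not part of the original report: it reads installed versions with the standard library's importlib.metadata. The distribution names in PACKAGES are taken from the list above, except "llamafactory" and "unsloth", which are assumed names and may differ in a given install. The helper names collect_versions and environment_report are hypothetical.

```python
import platform
from importlib import metadata

# Names from the Environment list; "llamafactory" and "unsloth" are assumed
# distribution names and may need adjusting for a particular install.
PACKAGES = ["torch", "torchvision", "transformers", "triton", "unsloth", "llamafactory"]


def collect_versions(packages):
    """Return {name: installed version, or 'not installed'} for each package."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = "not installed"
    return versions


def environment_report(packages=PACKAGES):
    """Build a plain-text report in the same 'key: value' style as the issue."""
    lines = [
        f"OS: {platform.system()} {platform.release()}",
        f"Python: {platform.python_version()}",
    ]
    for name, version in collect_versions(packages).items():
        lines.append(f"{name}: {version}")
    return "\n".join(lines)


if __name__ == "__main__":
    print(environment_report())
```

Running this with the same interpreter that launches the training script (E:/py311/python.exe in the traceback) avoids reporting versions from a different environment.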

Steps to Reproduce

Set up the environment as described above and, for the unsloth version, install unsloth with:
pip install "unsloth[colab-new] @ git+https://github.com/unslothai/unsloth.git"

Troubleshooting Steps Taken

Verified that I have full control permissions for the project directory.
Checked and confirmed that all environment variables are set correctly.
Ensured that no other processes are using the GPU (checked with nvidia-smi).
Attempted to run the script as an administrator.
Temporarily disabled antivirus and firewall software.
Created a new virtual environment and reinstalled all required packages.
Tried moving the project to a different directory.
Tested running the script directly from the command line instead of an IDE.
Checked for other Python processes running in Task Manager and ended them.
Uninstalled Conda and used Python 3.11.9 instead.
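The permission check on the project directory can be repeated for other directories a training run may write to. This is a minimal sketch under an assumption: that the current directory, the system temp directory, and the user home directory are worth checking. Whether any of them is involved in this error is not established by the report. The helper name can_write is hypothetical.

```python
import os
import tempfile


def can_write(directory):
    """Try to create and delete a small file in `directory`; return (ok, detail)."""
    try:
        # The file is removed automatically when the context manager exits.
        with tempfile.NamedTemporaryFile(dir=directory, prefix="perm_check_"):
            pass
        return True, "ok"
    except OSError as exc:  # PermissionError and FileNotFoundError are subclasses
        return False, f"{type(exc).__name__}: {exc}"


if __name__ == "__main__":
    # Assumed candidates, not a list of directories known to matter here.
    candidates = [os.getcwd(), tempfile.gettempdir(), os.path.expanduser("~")]
    for directory in candidates:
        ok, detail = can_write(directory)
        print(f"{directory}: {'writable' if ok else detail}")
```

A failure here would point to a file-system permission problem. A clean result suggests looking at process creation instead, which is where the traceback ends.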

The same issue with unsloth on Windows is reported here: https://github.com/hiyouga/LLaMA-Factory/issues/2990

Traceback:

PS C:\Users\SuperBoy\Unsloth-Windows-fineTuning-Qwen2> & E:/py311/python.exe c:/Users/SuperBoy/Unsloth-Windows-fineTuning-Qwen2/fine-tuning.py
🦥 Unsloth: Will patch your computer to enable 2x faster free finetuning.
[2024-08-13 15:33:14,489] [INFO] [real_accelerator.py:191:get_accelerator] Setting ds_accelerator to cuda (auto detect)
[2024-08-13 15:33:15,159] torch.distributed.elastic.multiprocessing.redirects: [WARNING] NOTE: Redirects are currently not supported in Windows or MacOs.
==((====))==  Unsloth 2024.8: Fast Qwen2 patching. Transformers = 4.44.0.
   \\   /|    GPU: NVIDIA GeForce RTX 3070 Laptop GPU. Max memory: 8.0 GB. Platform = Windows.
O^O/ \_/ \    Pytorch: 2.2.2+cu121. CUDA = 8.6. CUDA Toolkit = 12.1.
\        /    Bfloat16 = TRUE. FA [Xformers = 0.0.25.post1. FA2 = False]
 "-____-"     Free Apache license: http://github.com/unslothai/unsloth
E:\py311\Lib\site-packages\accelerate\utils\modeling.py:825: UserWarning: expandable_segments not supported on this platform (Triggered internally at ..\c10/cuda/CUDAAllocatorConfig.h:30.)
  _ = torch.tensor([0], device=i)
Unsloth 2024.8 patched 28 layers with 0 QKV layers, 28 O layers and 28 MLP layers.
max_steps is given, it will override any value given in num_train_epochs
==((====))==  Unsloth - 2x faster free finetuning | Num GPUs = 1
   \\   /|    Num examples = 541 | Num Epochs = 2
O^O/ \_/ \    Batch size per device = 2 | Gradient Accumulation steps = 4
\        /    Total batch size = 8 | Total steps = 70
 "-____-"     Number of trainable parameters = 18,464,768
  0%|          | 0/70 [00:00<?, ?it/s]ptxas info    : 11 bytes gmem
ptxas info    : Compiling entry function '_rms_layernorm_forward_0d1de2d3de4d5c6d7c8de9' for 'sm_86'
ptxas info    : Function properties for _rms_layernorm_forward_0d1de2d3de4d5c6d7c8de9
    0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
ptxas info    : Used 28 registers, 408 bytes cmem[0]
Traceback (most recent call last):
  File "c:\Users\SuperBoy\Unsloth-Windows-fineTuning-Qwen2\fine-tuning.py", line 84, in <module>
    trainer.train()
  File "<string>", line 126, in train
  File "<string>", line 363, in _fast_inner_training_loop
  File "E:\py311\Lib\site-packages\transformers\trainer.py", line 3328, in training_step
    loss = self.compute_loss(model, inputs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "E:\py311\Lib\site-packages\transformers\trainer.py", line 3373, in compute_loss
    outputs = model(**inputs)
              ^^^^^^^^^^^^^^^
  File "E:\py311\Lib\site-packages\torch\nn\modules\module.py", line 1511, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "E:\py311\Lib\site-packages\torch\nn\modules\module.py", line 1520, in _call_impl
    return forward_call(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "E:\py311\Lib\site-packages\accelerate\utils\operations.py", line 822, in forward
    return model_forward(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "E:\py311\Lib\site-packages\accelerate\utils\operations.py", line 810, in __call__
    return convert_to_fp32(self.model_forward(*args, **kwargs))
                           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "E:\py311\Lib\site-packages\torch\amp\autocast_mode.py", line 16, in decorate_autocast
    return func(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^
  File "E:\py311\Lib\site-packages\torch\_compile.py", line 24, in inner
    return torch._dynamo.disable(fn, recursive)(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "E:\py311\Lib\site-packages\torch\_dynamo\eval_frame.py", line 489, in _fn
    return fn(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^
  File "E:\py311\Lib\site-packages\unsloth\models\llama.py", line 978, in PeftModelForCausalLM_fast_forward
    return self.base_model(
           ^^^^^^^^^^^^^^^^
  File "E:\py311\Lib\site-packages\torch\nn\modules\module.py", line 1511, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "E:\py311\Lib\site-packages\torch\nn\modules\module.py", line 1520, in _call_impl
    return forward_call(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "E:\py311\Lib\site-packages\peft\tuners\tuners_utils.py", line 188, in forward
    return self.model.forward(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "E:\py311\Lib\site-packages\accelerate\hooks.py", line 166, in new_forward
    output = module._old_forward(*args, **kwargs)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "E:\py311\Lib\site-packages\unsloth\models\llama.py", line 897, in _CausalLM_fast_forward
    outputs = self.model(
              ^^^^^^^^^^^
  File "E:\py311\Lib\site-packages\torch\nn\modules\module.py", line 1511, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "E:\py311\Lib\site-packages\torch\nn\modules\module.py", line 1520, in _call_impl
    return forward_call(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "E:\py311\Lib\site-packages\accelerate\hooks.py", line 166, in new_forward
    output = module._old_forward(*args, **kwargs)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "E:\py311\Lib\site-packages\unsloth\models\llama.py", line 753, in LlamaModel_fast_forward
    layer_outputs = torch.utils.checkpoint.checkpoint(
                    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "E:\py311\Lib\site-packages\torch\_compile.py", line 24, in inner
    return torch._dynamo.disable(fn, recursive)(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "E:\py311\Lib\site-packages\torch\_dynamo\eval_frame.py", line 489, in _fn
    return fn(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^
  File "E:\py311\Lib\site-packages\torch\_dynamo\external_utils.py", line 17, in inner
    return fn(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^
  File "E:\py311\Lib\site-packages\torch\utils\checkpoint.py", line 482, in checkpoint
    return CheckpointFunction.apply(function, preserve, *args)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "E:\py311\Lib\site-packages\torch\autograd\function.py", line 553, in apply
    return super().apply(*args, **kwargs)  # type: ignore[misc]
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "E:\py311\Lib\site-packages\torch\utils\checkpoint.py", line 261, in forward
    outputs = run_function(*args)
              ^^^^^^^^^^^^^^^^^^^
  File "E:\py311\Lib\site-packages\unsloth\models\llama.py", line 749, in custom_forward
    return module(*inputs, past_key_value, output_attentions, padding_mask = padding_mask)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "E:\py311\Lib\site-packages\torch\nn\modules\module.py", line 1511, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "E:\py311\Lib\site-packages\torch\nn\modules\module.py", line 1520, in _call_impl
    return forward_call(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "E:\py311\Lib\site-packages\accelerate\hooks.py", line 166, in new_forward
    output = module._old_forward(*args, **kwargs)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "E:\py311\Lib\site-packages\unsloth\models\llama.py", line 466, in LlamaDecoderLayer_fast_forward
    hidden_states = fast_rms_layernorm(self.input_layernorm, hidden_states)
                    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "E:\py311\Lib\site-packages\unsloth\kernels\rms_layernorm.py", line 190, in fast_rms_layernorm
    out = Fast_RMS_Layernorm.apply(X, W, eps, gemma)
          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "E:\py311\Lib\site-packages\torch\autograd\function.py", line 553, in apply
    return super().apply(*args, **kwargs)  # type: ignore[misc]
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "E:\py311\Lib\site-packages\unsloth\kernels\rms_layernorm.py", line 144, in forward
    fx[(n_rows,)](
  File "E:\py311\Lib\site-packages\triton\runtime\jit.py", line 541, in run
    self.cache[device][key] = compile(
                              ^^^^^^^^
  File "E:\py311\Lib\site-packages\triton\compiler\compiler.py", line 202, in compile
    so_path = backend.make_launcher_stub(src, metadata)
              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "E:\py311\Lib\site-packages\triton\compiler\backends\cuda.py", line 224, in make_launcher_stub
    return make_stub(src.name, src.signature, constants, ids, enable_warp_specialization=enable_warp_specialization)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "E:\py311\Lib\site-packages\triton\compiler\make_launcher.py", line 37, in make_stub
    so = _build(name, src_path, tmpdir)
         ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "E:\py311\Lib\site-packages\triton\common\build.py", line 124, in _build
    ret = subprocess.check_call(cc_cmd)
          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "E:\py311\Lib\subprocess.py", line 408, in check_call
    retcode = call(*popenargs, **kwargs)
              ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "E:\py311\Lib\subprocess.py", line 389, in call
    with Popen(*popenargs, **kwargs) as p:
         ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "E:\py311\Lib\subprocess.py", line 1026, in __init__
    self._execute_child(args, executable, preexec_fn, close_fds,
  File "E:\py311\Lib\subprocess.py", line 1538, in _execute_child
    hp, ht, pid, tid = _winapi.CreateProcess(executable, args,
                       ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
PermissionError: [WinError 5] 拒绝访问。

(The error message means "Access is denied.")
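The last frames of the traceback show the error coming from _winapi.CreateProcess, reached through subprocess.check_call(cc_cmd) in triton\common\build.py. In other words, Windows refused to start the external command that Triton uses to build its launcher stub. The sketch below is a diagnostic under stated assumptions, not Triton's own lookup logic: it assumes the command is a C compiler taken from the CC environment variable or found on PATH under the names gcc, clang, or cl. The helper names candidate_compilers and probe_executable are hypothetical.

```python
import os
import shutil
import subprocess
import sys


def candidate_compilers(names=("gcc", "clang", "cl")):
    """List compiler paths from CC and PATH (an assumed lookup, not Triton's)."""
    found = []
    cc = os.environ.get("CC")
    if cc:
        found.append(cc)
    for name in names:
        resolved = shutil.which(name)
        if resolved and resolved not in found:
            found.append(resolved)
    return found


def probe_executable(path, args=("--version",)):
    """Try to launch `path` and report whether the OS allowed it to start."""
    try:
        result = subprocess.run(
            [path, *args], capture_output=True, text=True, timeout=30
        )
    except subprocess.TimeoutExpired:
        # The process started but did not finish in time.
        return {"path": path, "launched": True, "returncode": None}
    except OSError as exc:
        # PermissionError (WinError 5) and FileNotFoundError both land here.
        return {"path": path, "launched": False,
                "error": f"{type(exc).__name__}: {exc}"}
    return {"path": path, "launched": True, "returncode": result.returncode}


if __name__ == "__main__":
    compilers = candidate_compilers()
    if not compilers:
        print("No C compiler found via CC or PATH.")
    for compiler in compilers:
        print(probe_executable(compiler))
    # Sanity check: the Python interpreter itself should always launch.
    print(probe_executable(sys.executable))
```

If a compiler path reports launched: False with a PermissionError, the same failure is reproduced outside the training run, which narrows the problem to that executable or its location rather than to unsloth or LLaMA Factory. A nonzero returncode alone is not a concern here, since some compilers do not accept --version.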

unslothai / unsloth

PermissionError: [WinError 5] #914