unslothai / unsloth

Finetune Llama 3.2, Mistral, Phi & Gemma LLMs 2-5x faster with 80% less memory
https://unsloth.ai
Apache License 2.0

AMD unsloth/kernels/rms_layernorm.py":22:0): error: unsupported target: 'gfx906' > RuntimeError: PassManager::run failed #1160

Open unclemusclez opened 1 day ago

unclemusclez commented 1 day ago

My GPU is a gfx906.

I will try this again on my gfx1100.

INFO     | 2024-10-21 13:03:40 | autotrain.trainers.clm.train_clm_sft:train:39 - creating trainer
Generating train split: 4267 examples [00:16, 258.45 examples/s]
/usr/local/open-webui/.venv/lib/python3.12/site-packages/trl/trainer/sft_trainer.py:401: FutureWarning: `tokenizer` is deprecated and will be removed in version 5.0.0 for `SFTTrainer.__init__`. Use `processing_class` instead.
  super().__init__(
/bin/sh: 1: nvidia-smi: not found
==((====))==  Unsloth - 2x faster free finetuning | Num GPUs = 1
   \\   /|    Num examples = 4,267 | Num Epochs = 3
O^O/ \_/ \    Batch size per device = 8 | Gradient Accumulation steps = 15
\        /    Total batch size = 120 | Total steps = 105
 "-____-"     Number of trainable parameters = 18,464,768
/bin/sh: 1: nvidia-smi: not found
INFO     | 2024-10-21 13:04:00 | autotrain.trainers.common:on_train_begin:386 - Starting to train...
  0%|                                                                                                                                                                                                                | 0/105 [00:00<?, ?it/s]loc("/usr/local/open-webui/.venv/lib/python3.12/site-packages/unsloth/kernels/rms_layernorm.py":22:0): error: unsupported target: 'gfx906'
ERROR    | 2024-10-21 13:04:03 | autotrain.trainers.common:wrapper:215 - train has failed due to an exception: Traceback (most recent call last):
  File "/usr/local/open-webui/.venv/lib/python3.12/site-packages/autotrain/trainers/common.py", line 212, in wrapper
    return func(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/open-webui/.venv/lib/python3.12/site-packages/autotrain/trainers/clm/__main__.py", line 28, in train
    train_sft(config)
  File "/usr/local/open-webui/.venv/lib/python3.12/site-packages/autotrain/trainers/clm/train_clm_sft.py", line 55, in train
    trainer.train()
  File "<string>", line 155, in train
  File "<string>", line 368, in _fast_inner_training_loop
  File "/usr/local/open-webui/.venv/lib/python3.12/site-packages/transformers/trainer.py", line 3565, in training_step
    loss = self.compute_loss(model, inputs, num_items_in_batch=num_items_in_batch)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/open-webui/.venv/lib/python3.12/site-packages/unsloth/models/_utils.py", line 1169, in _unsloth_pre_compute_loss
    return self._old_compute_loss(model, inputs, *args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/open-webui/.venv/lib/python3.12/site-packages/transformers/trainer.py", line 3615, in compute_loss
    outputs = model(**inputs)
              ^^^^^^^^^^^^^^^
  File "/usr/local/open-webui/.venv/lib/python3.12/site-packages/torch/nn/modules/module.py", line 1736, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/open-webui/.venv/lib/python3.12/site-packages/torch/nn/modules/module.py", line 1747, in _call_impl
    return forward_call(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/open-webui/.venv/lib/python3.12/site-packages/accelerate/utils/operations.py", line 820, in forward
    return model_forward(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/open-webui/.venv/lib/python3.12/site-packages/accelerate/utils/operations.py", line 808, in __call__
    return convert_to_fp32(self.model_forward(*args, **kwargs))
                           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/open-webui/.venv/lib/python3.12/site-packages/torch/amp/autocast_mode.py", line 44, in decorate_autocast
    return func(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/open-webui/.venv/lib/python3.12/site-packages/torch/_compile.py", line 32, in inner
    return disable_fn(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/open-webui/.venv/lib/python3.12/site-packages/torch/_dynamo/eval_frame.py", line 632, in _fn
    return fn(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^
  File "/usr/local/open-webui/.venv/lib/python3.12/site-packages/unsloth/models/llama.py", line 1044, in PeftModelForCausalLM_fast_forward
    return self.base_model(
           ^^^^^^^^^^^^^^^^
  File "/usr/local/open-webui/.venv/lib/python3.12/site-packages/torch/nn/modules/module.py", line 1736, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/open-webui/.venv/lib/python3.12/site-packages/torch/nn/modules/module.py", line 1747, in _call_impl
    return forward_call(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/open-webui/.venv/lib/python3.12/site-packages/peft/tuners/tuners_utils.py", line 197, in forward
    return self.model.forward(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/open-webui/.venv/lib/python3.12/site-packages/unsloth/models/llama.py", line 942, in _CausalLM_fast_forward
    outputs = self.model(
              ^^^^^^^^^^^
  File "/usr/local/open-webui/.venv/lib/python3.12/site-packages/torch/nn/modules/module.py", line 1736, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/open-webui/.venv/lib/python3.12/site-packages/torch/nn/modules/module.py", line 1747, in _call_impl
    return forward_call(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/open-webui/.venv/lib/python3.12/site-packages/unsloth/models/llama.py", line 776, in LlamaModel_fast_forward
    hidden_states = Unsloth_Offloaded_Gradient_Checkpointer.apply(
                    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/open-webui/.venv/lib/python3.12/site-packages/torch/autograd/function.py", line 575, in apply
    return super().apply(*args, **kwargs)  # type: ignore[misc]
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/open-webui/.venv/lib/python3.12/site-packages/torch/amp/autocast_mode.py", line 465, in decorate_fwd
    return fwd(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/open-webui/.venv/lib/python3.12/site-packages/unsloth/models/_utils.py", line 793, in forward
    output = forward_function(hidden_states, *args)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/open-webui/.venv/lib/python3.12/site-packages/torch/nn/modules/module.py", line 1736, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/open-webui/.venv/lib/python3.12/site-packages/torch/nn/modules/module.py", line 1747, in _call_impl
    return forward_call(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/open-webui/.venv/lib/python3.12/site-packages/unsloth/models/llama.py", line 490, in LlamaDecoderLayer_fast_forward
    hidden_states = fast_rms_layernorm(self.input_layernorm, hidden_states)
                    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/open-webui/.venv/lib/python3.12/site-packages/unsloth/kernels/rms_layernorm.py", line 192, in fast_rms_layernorm
    out = Fast_RMS_Layernorm.apply(X, W, eps, gemma)
          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/open-webui/.venv/lib/python3.12/site-packages/torch/autograd/function.py", line 575, in apply
    return super().apply(*args, **kwargs)  # type: ignore[misc]
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/open-webui/.venv/lib/python3.12/site-packages/unsloth/kernels/rms_layernorm.py", line 144, in forward
    fx[(n_rows,)](
  File "/usr/local/open-webui/.venv/lib/python3.12/site-packages/triton/runtime/jit.py", line 345, in <lambda>
    return lambda *args, **kwargs: self.run(grid=grid, warmup=False, *args, **kwargs)
                                   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/open-webui/.venv/lib/python3.12/site-packages/triton/runtime/jit.py", line 662, in run
    kernel = self.compile(
             ^^^^^^^^^^^^^
  File "/usr/local/open-webui/.venv/lib/python3.12/site-packages/triton/compiler/compiler.py", line 282, in compile
    next_module = compile_ir(module, metadata)
                  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/open-webui/.venv/lib/python3.12/site-packages/triton/backends/amd/compiler.py", line 255, in <lambda>
    stages["llir"] = lambda src, metadata: self.make_llir(src, metadata, options)
                                           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/open-webui/.venv/lib/python3.12/site-packages/triton/backends/amd/compiler.py", line 186, in make_llir
    pm.run(mod)
RuntimeError: PassManager::run failed

ERROR    | 2024-10-21 13:04:03 | autotrain.trainers.common:wrapper:216 - PassManager::run failed
  0%|                                                                                                                                                                                                                | 0/105 [00:03<?, ?it/s]
INFO     | 2024-10-21 13:04:17 | autotrain.parser:run:239 - Job ID: 27552

darkacorn commented 1 day ago

This is expected: Unsloth does not support AMD GPUs for the time being. We'd love it if you wanted to contribute an implementation for it.
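
For anyone who wants to fail early instead of hitting the Triton compile error partway into `trainer.train()`, here is a minimal sketch that reports the AMD target before training starts. It assumes a ROCm build of PyTorch that exposes `gcnArchName` on the device properties; the helper names `base_arch` and `detect_amd_arch` are made up for illustration and are not part of Unsloth. Which targets Triton's AMD backend accepts depends on the installed Triton version, so the sketch only reports the arch and does not judge it.

```python
def base_arch(gcn_arch_name: str) -> str:
    """Strip feature suffixes such as ':sramecc+:xnack-' from a ROCm arch string."""
    return gcn_arch_name.split(":", 1)[0]


def detect_amd_arch():
    """Return the base gfx target of GPU 0 on a ROCm build of PyTorch, else None."""
    try:
        import torch
    except ImportError:
        return None
    # torch.version.hip is None on CUDA and CPU-only builds of PyTorch.
    if getattr(torch.version, "hip", None) is None:
        return None
    if not torch.cuda.is_available():
        return None
    props = torch.cuda.get_device_properties(0)
    # Assumption: ROCm builds expose gcnArchName; treat it as optional.
    name = getattr(props, "gcnArchName", None)
    return base_arch(name) if name else None


if __name__ == "__main__":
    arch = detect_amd_arch()
    if arch is None:
        print("No ROCm GPU detected by PyTorch.")
    else:
        print(f"ROCm GPU target: {arch}. Check that your Triton build supports it.")
```

Running this before launching the job would show `gfx906` or `gfx1100` up front, which makes it easier to compare the two cards mentioned in this issue.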

unclemusclez commented 1 day ago

I am keeping this open to document the progress. I believe I had this running.
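
To help document progress across runs, a small sketch like the one below could pull the rejected target and the kernel source location out of a saved training log. The regular expressions are modelled only on the error line in the log above; other Triton versions may word the message differently, and `find_unsupported_targets` is a hypothetical helper name.

```python
import re

# Patterns modelled on the single error line in the log above (an assumption).
UNSUPPORTED_TARGET = re.compile(r"error: unsupported target: '([^']+)'")
KERNEL_LOC = re.compile(r'loc\("([^"]+)":(\d+):(\d+)\)')


def find_unsupported_targets(log_text: str):
    """Return (target, source_file, line_number) for each unsupported-target error."""
    results = []
    for line in log_text.splitlines():
        target = UNSUPPORTED_TARGET.search(line)
        if target is None:
            continue
        loc = KERNEL_LOC.search(line)
        source = loc.group(1) if loc else None
        line_no = int(loc.group(2)) if loc else None
        results.append((target.group(1), source, line_no))
    return results


if __name__ == "__main__":
    sample = (
        'loc("/usr/local/open-webui/.venv/lib/python3.12/site-packages/'
        'unsloth/kernels/rms_layernorm.py":22:0): '
        "error: unsupported target: 'gfx906'"
    )
    for target, source, line_no in find_unsupported_targets(sample):
        print(target, source, line_no)
```

Feeding it the log from each attempt (gfx906, then gfx1100) gives a quick record of which kernels, if any, were rejected for which target.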