-
Hi, @arvindsun
I'm not sure whether you've seen [this repository](https://github.com/linkedin/Liger-Kernel) from the folks at LinkedIn, but on the off chance you have, I was wondering if you'd had any lu…
-
### Reminder
- [X] I have read the README and searched the existing issues.
### System Info
### model
# model_name_or_path: /mnt/nas/shanzhi/eval_models/Qwen2-7B
model_name_or_path: /mnt/nas/liya…
-
### 🐛 Describe the bug
I'm encountering a `ValueError` when trying to load the Qwen2-VL model using the `AutoLigerKernelForCausalLM` class from Liger Kernel. The error message indicates an unrecogn…
-
Hello, thank you for this great work.
https://github.com/linkedin/Liger-Kernel/blob/acd82728207ebafad28d448640502c108901a967/src/liger_kernel/ops/fused_linear_cross_entropy.py#L69
https://github.c…
-
What is the minimum single-GPU requirement for fine-tuning? Is fine-tuning with Unsloth supported?
-
Runtime resources:
Model: Qwen2.5-32B-Instruct
Dataset: custom dataset
Single-GPU run script:
Fine-tuning method: QLoRA
CUDA_VISIBLE_DEVICES=0 \
swift sft \
--model_type qwen2_5-32b-instruct \
--model_id_or_path /hy-tmp/model/Qwen/Qwen2.5-32B-I…
-
### 🐛 Describe the bug
When trying to train LoRA layers on the base model while also setting `modules_to_save` in the LoRA config, which makes the embedding layers trainable (my assumption is it also ap…
-
### 🐛 Describe the bug
I'm trying to test this library on an HPC cluster with AMD MI250X GPUs, but I'm getting a weird, seemingly Triton-related error specifically when I turn on `model.train()`. Th…
-
### 🐛 Describe the bug
Tensors saved in `medusa_only_heads` mode are empty.
Ref: https://github.com/linkedin/Liger-Kernel/blob/main/examples/medusa/train.py#L392
### Reproduce
_No response_
### V…
-
### 🐛 Describe the bug
I'm using `flyte` to reproduce the token throughput and memory savings results reported in this [repo's README](https://github.com/linkedin/Liger-Kernel?tab=readme-ov-file#su…