liger Search Results - Githubissues

854 results
for liger


abacusai/gh200-llm #3

[Feature Request] - Liger Kernels on ARM64 for GH200?

Hi, @arvindsun I'm not sure if you've seen [this repository](https://github.com/linkedin/Liger-Kernel) from the folks at LinkedIn, but I was wondering on the off chance you did if you'd had any lu…

jlotthammer updated 3 months ago
2
hiyouga/LLaMA-Factory #5838

DPO Qwen2-72B OOM: how should 9*8 A800 80G be configured?

### Reminder - [X] I have read the README and searched the existing issues. ### System Info ### model # model_name_or_path: /mnt/nas/shanzhi/eval_models/Qwen2-7B model_name_or_path: /mnt/nas/liya…

BobTsang1995 updated 1 month ago
1
linkedin/Liger-Kernel #249

ValueError when Loading Qwen2-VL Model with Liger Kernel

### 🐛 Describe the bug I'm encountering a ValueError when trying to load the Qwen2-VL model using the AutoLigerKernelForCausalLM class from the Liger Kernel. The error message indicates an unrecogn…

rahatarinasir updated 2 months ago
1
linkedin/Liger-Kernel #241

Reasons for upcasting the logits dtype outside the kernel

Hello, thank you for this great work. https://github.com/linkedin/Liger-Kernel/blob/acd82728207ebafad28d448640502c108901a967/src/liger_kernel/ops/fused_linear_cross_entropy.py#L69 https://github.c…

yzhangcs updated 2 months ago
7
2U1/Llama3.2-Vision-Finetune #13

Minimum GPU memory need for fine-tuning

What is the minimum single-GPU memory needed for fine-tuning? Is fine-tuning with Unsloth supported?

alaminkawsar updated 2 days ago
9
modelscope/ms-swift #2284

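Help wanted: same model and dataset train normally on a single GPU (A40, 48GB), but multi-GPU (4 * 3090, 96GB) MP mode hits OOM; could someone help analyze…

Runtime resources: Model: Qwen2.5-32B-Instruct Dataset: custom dataset Single-GPU run script: Fine-tuning method: QLoRA CUDA_VISIBLE_DEVICES=0 \ swift sft \ --model_type qwen2_5-32b-instruct \ --model_id_or_path /hy-tmp/model/Qwen/Qwen2.5-32B-I…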

camposs1979 updated 3 weeks ago
1
linkedin/Liger-Kernel #48

Unable to use FLCE with FSDP+PEFT+embeddings layers

### 🐛 Describe the bug when trying to train both LoRA layers on the base model and also set modules_to_save on the lora config which makes the embeddings layers trainable (my assumption is it also ap…

winglian updated 1 month ago
5
linkedin/Liger-Kernel #231

Triton error on AMD GPUs

### 🐛 Describe the bug I'm trying to test this library on an HPC cluster with AMD MI250X GPUs, but I'm getting a weird seemingly Triton-related error specifically when I turn on `model.train()`. Th…

eminorhan updated 2 months ago
8
linkedin/Liger-Kernel #309

Empty Medusa head tensors

### 🐛 Describe the bug Tensors saved in `medusa_only_heads` mode are empty. Ref: https://github.com/linkedin/Liger-Kernel/blob/main/examples/medusa/train.py#L392 ### Reproduce _No response_ ### V…

vkc1vk updated 3 weeks ago
2
linkedin/Liger-Kernel #236

Benchmarking phi3 on single A100 40gb GPU: unable to reprodu…

### 🐛 Describe the bug I'm using `flyte` to reproduce the token throughput and memory savings results reported in this [repo's README](https://github.com/linkedin/Liger-Kernel?tab=readme-ov-file#su…

cosmicBboy updated 2 months ago
3
