-
Code:
```python
# Environment setup (run in the shell before launching this script):
#   conda activate beyond_scale_2_unsloth
import torch
from datasets import load_dataset
from trl import SFTConfig, SFTTrainer
from unsloth import FastLanguageModel
from tr…
```
-
Hi,
I encountered an issue after updating to unsloth=="2024.11.6". When training the `Qwen2.5-0.5B-Instruct` model without PEFT, I observed that the model's gradient norm is 0, resulting in no weig…
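For context, a minimal sketch of the setup being described, assuming full fine-tuning through FastLanguageModel with no PEFT adapters attached; the batch and hyperparameters below are placeholders, not the reporter's actual script:
```python
import torch
from unsloth import FastLanguageModel

# Load the model without attaching any LoRA/PEFT adapters
# (FastLanguageModel.get_peft_model is deliberately not called).
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="Qwen/Qwen2.5-0.5B-Instruct",
    max_seq_length=2048,
    load_in_4bit=False,  # full-precision weights so every parameter can train
)

# One forward/backward pass on a placeholder batch, then inspect gradients.
batch = tokenizer("Hello, world!", return_tensors="pt").to(model.device)
out = model(**batch, labels=batch["input_ids"])
out.loss.backward()

grads = [p.grad.norm() for p in model.parameters() if p.grad is not None]
total_norm = torch.norm(torch.stack(grads))
print(f"grad norm: {total_norm.item()}")  # 0.0 reproduces the reported symptom
```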
-
I tried to write a custom trainer based on SFTTrainer and train it with Unsloth. The code snippet is:
```python
# customize from SFTTrainer
class CustomTrainer(SFTTrainer):
    def compute_loss(self,…
```
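For reference, a self-contained sketch of what such an override can look like; the pass-through loss below is a placeholder, not the reporter's actual customization, and the `num_items_in_batch` argument assumes a recent transformers release:
```python
from trl import SFTTrainer

# customize from SFTTrainer
class CustomTrainer(SFTTrainer):
    # Signature mirrors transformers' Trainer.compute_loss; newer releases
    # also pass num_items_in_batch, so accept it to stay compatible.
    def compute_loss(self, model, inputs, return_outputs=False, num_items_in_batch=None):
        outputs = model(**inputs)
        loss = outputs.loss  # standard causal-LM cross-entropy from the model
        # ... any custom reweighting or masking would go here ...
        return (loss, outputs) if return_outputs else loss
```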
-
Here is the error it gives, in every notebook I tried:
---------------------------------------------------------------------------
NameError Traceback (most recent…
-
I want to train only the head of the network (LoRA is fine); how do I do that? I get this error:
```bash
(beyond_scale_2_unsloth) brando9@ampere1~/beyond-scale-2-alignment-coeff $ python /…
```
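One plausible approach (an assumption on my part, not a confirmed fix for the error above) is to point a peft `LoraConfig` at the output head only, since `lm_head` is a regular linear layer; the model name here is a placeholder:
```python
from peft import LoraConfig, get_peft_model
from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained("Qwen/Qwen2.5-0.5B-Instruct")

config = LoraConfig(
    r=16,
    lora_alpha=32,
    target_modules=["lm_head"],  # attach LoRA to the output head only
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, config)  # all other weights stay frozen
model.print_trainable_parameters()     # should list only the lm_head LoRA params
```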
-
I am performing a mega-merge using LLaMA 3.2 3B, combining the base model with a fine-tuned/instruction-tuned variant via the DARE linear method. Following the successful completion of the initial merge, I encoun…
-
Runtime resources:
Model: Qwen2.5-32B-Instruct
Dataset: custom dataset
Fine-tuning method: QLoRA
Single-GPU run script:
CUDA_VISIBLE_DEVICES=0 \
swift sft \
--model_type qwen2_5-32b-instruct \
--model_id_or_path /hy-tmp/model/Qwen/Qwen2.5-32B-I…
-
Unsloth: Merging 4bit and LoRA weights to 16bit...
Unsloth: Will use up to 23.73 out of 50.99 RAM for saving.
100%|██████████| 32/32 [00:19
4 if True: model.push_to_hub_gguf("mINE", tokenizer, quant…
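For comparison, the documented shape of Unsloth's GGUF export call is below; the repo id, quantization method, and token are placeholders:
```python
# Push a GGUF conversion of the (merged) model to the Hugging Face Hub.
model.push_to_hub_gguf(
    "your-username/your-model",    # hypothetical Hub repo id
    tokenizer,
    quantization_method="q4_k_m",  # one of the GGUF quantization presets
    token="hf_...",                # Hugging Face write token (placeholder)
)
```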
-
### ⚠️ Please check that this feature request hasn't been suggested before.
- [X] I searched previous [Ideas in Discussions](https://github.com/OpenAccess-AI-Collective/axolotl/discussions/categories…
-
To reproduce:
Install the latest versions of unsloth and transformers:
```bash
!pip uninstall unsloth -y && pip install --upgrade --no-cache-dir "unsloth[colab-new] @ git+https://github.com/unslothai/unslot…
```