-
```python
from unsloth import FastLanguageModel
import torch
max_seq_length = 2048 # Choose any! We auto support RoPE Scaling internally!
dtype = None # None for auto detection. Float16 for Tesla T4, V100, Bfloat16 for Ampere+
```
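For context, a minimal sketch of the loading call these variables typically feed into in the Unsloth notebooks (the model name here is illustrative, not part of the original excerpt):

```python
from unsloth import FastLanguageModel

# Sketch only: model_name is illustrative; max_seq_length and dtype
# are the variables defined in the snippet above.
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name = "unsloth/Meta-Llama-3.1-8B-Instruct-bnb-4bit",
    max_seq_length = max_seq_length,
    dtype = dtype,
    load_in_4bit = True,  # 4-bit quantization to cut memory use
)
```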
-
When loading the model [unsloth/Meta-Llama-3.1-8B-Instruct-bnb-4bit](https://huggingface.co/unsloth/Meta-Llama-3.1-8B-Instruct-bnb-4bit), I get this error: `AssertionError: model.safetensors.index.…`
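As a point of comparison, a minimal sketch of loading the same checkpoint through plain transformers (assuming the repo's bundled bitsandbytes quantization_config is picked up automatically):

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "unsloth/Meta-Llama-3.1-8B-Instruct-bnb-4bit"
tokenizer = AutoTokenizer.from_pretrained(model_id)
# The repo ships a quantization_config, so transformers loads the
# 4-bit weights directly; bitsandbytes and accelerate must be installed.
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")
```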
-
I am encountering an issue with evaluating Bitsandbytes 4-bit and 8-bit quantized models on the Berkeley Function Call Leaderboard (BFCL). I have successfully quantized my models using Bitsandbytes an…
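For context, a sketch of the quantization step as it is typically done through transformers and bitsandbytes (the base model id below is a placeholder, not the one from this report):

```python
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig

# 4-bit NF4 setup; for 8-bit, use BitsAndBytesConfig(load_in_8bit=True).
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)
model = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Meta-Llama-3.1-8B-Instruct",  # placeholder model id
    quantization_config=bnb_config,
    device_map="auto",
)
```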
-
I am trying to upload a `safetensors` file to Hugging Face, without success, for `unsloth/Llama-3.2-3B-Instruct`, following this [example](https://colab.research.google.com/drive/1T5-zKWM_5OD21QHwXHiV9ixTRR7k3iB9?usp=sharing):
```python
…
```
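As a point of reference, a hedged sketch of a typical Unsloth push-to-hub flow (the target repo name and token are placeholders, not values from the original notebook):

```python
from unsloth import FastLanguageModel

model, tokenizer = FastLanguageModel.from_pretrained(
    model_name = "unsloth/Llama-3.2-3B-Instruct",
    max_seq_length = 2048,
    load_in_4bit = True,
)
# push_to_hub writes safetensors by default; the repo name and token
# below are placeholders.
model.push_to_hub("your-username/Llama-3.2-3B-test", token = "hf_...")
tokenizer.push_to_hub("your-username/Llama-3.2-3B-test", token = "hf_...")
```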
-
Hello, the config at https://huggingface.co/Qwen/Qwen2.5-0.5B-Instruct-GPTQ-Int4/blob/c34a4a91629f09f73a285f32dbd26106b033c654/config.json#L29 states that the group size is 128 for 4-bit or 8-bit. So could y…
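For background, the group size is the number of weight columns that share one quantization scale and zero-point; in transformers it is set via GPTQConfig. A sketch of quantizing with group_size=128 (illustrative only, not the questioner's setup; requires the optimum/auto-gptq stack):

```python
from transformers import AutoModelForCausalLM, AutoTokenizer, GPTQConfig

model_id = "Qwen/Qwen2.5-0.5B-Instruct"  # full-precision base model
tokenizer = AutoTokenizer.from_pretrained(model_id)
# group_size=128: every 128 weight columns share one scale/zero-point.
gptq_config = GPTQConfig(bits=4, group_size=128, dataset="c4", tokenizer=tokenizer)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    quantization_config=gptq_config,
    device_map="auto",
)
```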
-
Hi all, I am trying to fine-tune models on extremely long contexts.
I've tested the training setup below, and I managed to fine-tune:
- llama3.1-1B with a max_sequence_length of 128 * 1024 tokens
…
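For context, a minimal sketch of a long-context load in this vein, assuming an Unsloth-style setup (the model id and LoRA settings are illustrative, not the exact configuration from this report):

```python
from unsloth import FastLanguageModel

# Illustrative long-context load; 128 * 1024 matches the longest run above.
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name = "unsloth/Llama-3.2-1B-Instruct",  # placeholder model id
    max_seq_length = 128 * 1024,  # RoPE scaling is handled internally
    load_in_4bit = True,
)
model = FastLanguageModel.get_peft_model(
    model,
    r = 16,
    target_modules = ["q_proj", "k_proj", "v_proj", "o_proj",
                      "gate_proj", "up_proj", "down_proj"],
    use_gradient_checkpointing = "unsloth",  # offloads activations for long contexts
)
```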
-
Using the [official Llama 3.2 colab](https://colab.research.google.com/drive/1T5-zKWM_5OD21QHwXHiV9ixTRR7k3iB9?usp=sharing) linked from the [unsloth project homepage](https://github.com/unslothai/unsl…
-
### System Info
PyTorch 2.2 and 2.4 have both been tested.
transformers 4.46.2
4 × A6000 Ada
### Who can help?
@muellerzr
### Information
- [X] The official example scripts
- [ ] My own modified script…
-
mlx-community/Phi-3.5-vision-instruct-4bit fails to load in the current MLX build of LM Studio.
````
🥲 Failed to load the model
Failed to load model
Error when loading model: ValueError: Lo…
````
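As a first check outside LM Studio, a hedged sketch that tries to load the same checkpoint with the mlx-vlm package directly (assuming mlx-vlm is installed; this only verifies that the checkpoint itself loads under MLX):

```python
from mlx_vlm import load

# If this load succeeds, the checkpoint is fine and the failure is
# specific to the LM Studio MLX build.
model, processor = load("mlx-community/Phi-3.5-vision-instruct-4bit")
print(type(model).__name__)
```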
-
I'm on mlx-lm v0.19.1.
Running the following command with 4bit produced a bug: generation runs to the full 1000 max tokens, repeating the last two paragraphs over and over.
```bash
…
```
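A hedged equivalent of the run using the mlx-lm Python API (the model id and prompt are placeholders for whichever 4-bit model and prompt the command used):

```python
from mlx_lm import load, generate

# Placeholder 4-bit model; max_tokens mirrors the reported 1000-token run.
model, tokenizer = load("mlx-community/Meta-Llama-3.1-8B-Instruct-4bit")
text = generate(model, tokenizer, prompt="...", max_tokens=1000, verbose=True)
```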