-
Do you plan to do this for diffusers? Here is the setup that was made for bnb-NF4:
https://github.com/huggingface/diffusers/issues/9165
I think you can remove the bnb-NF4 stuff and add GGUF.
its featur…
-
Over on Hugging Face you all have your native bitsandbytes 4-bit pre-quantized models: https://huggingface.co/collections/unsloth/load-4bit-models-4x-faster-659042e3a41c3cbad582e734 It's really awesome…
-
## Description
We observe no improvement with PaddingFree on QLoRA and GPTQ-LoRA when running benchmarks on OrcaMath.
However,
- additionally applying FOAK along with PaddingFree shows signif…
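For context, here is a minimal sketch of what padding-free batching does, using transformers' `DataCollatorWithFlattening`. This is an assumption for illustration; the PaddingFree plugin benchmarked above may be implemented differently.

```python
# Sketch: padding-free batching via transformers' DataCollatorWithFlattening
# (requires a model/attention backend that respects position_ids boundaries,
# e.g. flash-attention-2).
from transformers import DataCollatorWithFlattening

collator = DataCollatorWithFlattening()

# Two examples of different lengths; no pad tokens are inserted.
features = [
    {"input_ids": [1, 2, 3, 4]},
    {"input_ids": [5, 6]},
]
batch = collator(features)

# All examples are concatenated into a single row; position_ids restart
# at each example boundary so attention does not cross examples.
print(batch["input_ids"].shape)  # torch.Size([1, 6])
print(batch["position_ids"])     # tensor([[0, 1, 2, 3, 0, 1]])
```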
-
### Main Objectives/Goals:
The primary objective is to generate higher returns by using certain products on the BNB Chain. This initiative aims to enhance returns for users while attracting a substan…
-
With the new version of transformers there is no need to use BetterTransformer; try setting the attn implementation to sdpa...
attn impl:
Unused kwargs: ['_load_in_4bit', '_load_in_8bit', 'quant_method']. These kwargs are not used …
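For concreteness, a minimal sketch of selecting SDPA attention in recent transformers versions (the model name is just an example):

```python
from transformers import AutoModelForCausalLM

# Recent transformers versions select the attention backend directly;
# no BetterTransformer conversion step is needed.
model = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Meta-Llama-3.1-8B-Instruct",  # example model
    attn_implementation="sdpa",  # use PyTorch scaled_dot_product_attention
)
```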
-
Hello,
I am trying to benchmark a model quantized to NF4 with bnb. How can I run it with the vLLM backend without getting a BrokenPipeError? Also, how can I utilize both GPUs on my machine?
Thank yo…
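A hedged sketch of one way to attempt this; exact flag support varies by vLLM version, the model name is only an example, and whether bitsandbytes quantization composes with tensor parallelism depends on your vLLM version. The `__main__` guard matters because vLLM spawns worker processes for tensor parallelism, and launching them at import time is a common source of BrokenPipeError.

```python
from vllm import LLM, SamplingParams

if __name__ == "__main__":
    llm = LLM(
        model="unsloth/llama-3-8b-bnb-4bit",  # example pre-quantized bnb repo
        quantization="bitsandbytes",
        load_format="bitsandbytes",           # required on older vLLM versions
        tensor_parallel_size=2,               # shard across both GPUs
    )
    outputs = llm.generate(["Hello"], SamplingParams(max_tokens=32))
    print(outputs[0].outputs[0].text)
```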
-
From: https://bscscan.com/tx/0xedd714e43389420d52f01d4ce3f65f84c0702a1dc8b27f4055b1402011bb5e11
CCTX: https://explorer.zetachain.com/cc/tx/0xd1a9da7eb54fe3d157ba139cc547dcc699dec7496c59df978cb85056…
-
Right now, when we finetune a LoRA on top of e.g. Llama 3.1 8B instruct, even if model_name is `meta-llama/Meta-Llama-3.1-8B-Instruct`, it gets resolved to `unsloth/meta-llama-3.1-8b-instruct-bnb-4bit…
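For reference, a sketch of the loading call where this resolution happens, using Unsloth's documented API; the `max_seq_length` value is illustrative:

```python
from unsloth import FastLanguageModel

model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="meta-llama/Meta-Llama-3.1-8B-Instruct",
    max_seq_length=2048,  # illustrative value
    load_in_4bit=True,    # with 4-bit loading, the unsloth/...-bnb-4bit mirror is substituted
)
```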
-
Hi, I am trying to run the `Llama-3.1 8b + Unsloth 2x faster finetuning.ipynb` notebook you provided in the README. However, when I run the second cell in Google Colab I get this error:
```bash
------…
-
Hi, I'm trying to fine-tune the Llama 3.1 8B model, but after fine-tuning it and uploading it to HF, I get this error when trying to run it with vLLM: "KeyError: 'base_model.model.model.layers.0.mlp.dow…
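One common workaround (a sketch, not necessarily the fix for this exact issue): merge the LoRA adapter into the base weights before serving, so vLLM never sees PEFT-prefixed keys like the one in the error. The repo names below are placeholders.

```python
from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer

base = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Meta-Llama-3.1-8B-Instruct",  # placeholder base model
    torch_dtype="auto",
)
model = PeftModel.from_pretrained(base, "your-username/your-lora")  # placeholder adapter repo
model = model.merge_and_unload()  # folds the LoRA deltas into the base weights

model.save_pretrained("merged-model")
AutoTokenizer.from_pretrained("meta-llama/Meta-Llama-3.1-8B-Instruct").save_pretrained("merged-model")
# Then point vLLM at ./merged-model instead of the adapter repo.
```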