-
### Feature request
Support for DBRX Instruct model in bitsandbytes
### Motivation
DBRX Instruct is reported to be the strongest open LLM available, but its 132B parameters make it unusable for most users. I tried this
…
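For reference, a minimal sketch of the standard transformers + bitsandbytes 4-bit loading path that this request would enable for DBRX; the model id is the public Hub checkpoint, and everything else is the usual `BitsAndBytesConfig` setup:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

# Standard NF4 4-bit setup; whether this works for DBRX depends on
# bitsandbytes/transformers supporting the architecture, which is the
# point of this feature request.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
    bnb_4bit_use_double_quant=True,
)

tokenizer = AutoTokenizer.from_pretrained("databricks/dbrx-instruct", trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    "databricks/dbrx-instruct",
    quantization_config=bnb_config,
    device_map="auto",
    trust_remote_code=True,
)
```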
-
Hi, thanks for your amazing work. I found that the code uses NF4 quantization by default, but there is no option to switch to uniform quantization (UQ). If I have a model quantized with GPTQ, how can I use LoftQ on it?
I have trie…
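For context, a minimal sketch of the usual LoftQ setup via peft's documented `LoftQConfig`/`LoraConfig` path; the model id is just an example. Note that LoftQ computes its LoRA initialization from the full-precision weights, so an already GPTQ-quantized checkpoint cannot be fed to it directly:

```python
from peft import LoftQConfig, LoraConfig, get_peft_model
from transformers import AutoModelForCausalLM

# LoftQ needs the original full-precision weights; do not pass an
# already quantized (e.g. GPTQ) model here.
base = AutoModelForCausalLM.from_pretrained("meta-llama/Llama-2-7b-hf")

loftq_config = LoftQConfig(loftq_bits=4)  # 4-bit target
lora_config = LoraConfig(
    init_lora_weights="loftq",            # use LoftQ initialization
    loftq_config=loftq_config,
    r=16,
    lora_alpha=16,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
)
model = get_peft_model(base, lora_config)
```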
-
![image](https://github.com/user-attachments/assets/4393b355-fa36-4d1d-ac79-4ec1ed58821f)
There seems to be no way to control this; is this a bug?
-
### System Info
Mistral-v1 is not quantized into 4-bit with my BitsAndBytesConfig, even though I set load_in_4bit=True.
### Reproduction
bnb_config = BitsAndBytesConfig(
load_in_4bit=True,
bnb…
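For what it's worth, a quick sanity check for whether 4-bit quantization was actually applied is to inspect the module types and the memory footprint; a sketch, with a placeholder Mistral model id:

```python
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig

bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)
model = AutoModelForCausalLM.from_pretrained(
    "mistralai/Mistral-7B-v0.1",
    quantization_config=bnb_config,
    device_map="auto",
)

# If quantization took effect, linear layers show up as bitsandbytes
# Linear4bit modules and the footprint is roughly a quarter of fp16.
print({type(m).__name__ for m in model.modules()})
print(model.get_memory_footprint() / 1e9, "GB")
```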
-
### Your question
I got:
Total VRAM 8188 MB, total RAM 16011 MB
pytorch version: 2.3.1+cu121
Set vram state to: NORMAL_VRAM
Device: cuda:0 NVIDIA GeForce RTX 4060 Laptop GPU : cudaMallocAsync
…
-
There is a promising new model:
https://huggingface.co/docs/diffusers/main/en/api/pipelines/flux
It would be good to take this model's pipeline from the diffusers package and add a canvas and attention …
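For reference, basic usage per the linked diffusers documentation looks roughly like this (following the documented FLUX.1-schnell example; canvas and attention controls would be layered on top):

```python
import torch
from diffusers import FluxPipeline

pipe = FluxPipeline.from_pretrained(
    "black-forest-labs/FLUX.1-schnell",
    torch_dtype=torch.bfloat16,
)
pipe.enable_model_cpu_offload()  # keeps VRAM usage manageable

image = pipe(
    "a cat holding a sign that says hello world",
    num_inference_steps=4,  # the schnell variant is distilled for few steps
    guidance_scale=0.0,
).images[0]
image.save("flux.png")
```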
-
How can FLUX LoRA be supported? It doesn't work for me.
Is there an easy way to apply a normal FLUX LoRA to the NF4-quantized model?
"Error occurred when executing SamplerCustomAdvanced:
.to() does not accept copy argument
File "F:\Co…
-
Currently, for every llama SKU, we have 6 configs:
- LoRA single device
- LoRA distributed
- QLoRA single device
- QLoRA distributed (after FSDP2)
- Full distributed
- Full single device
We d…
-
The `bigdl-llm 2.4.0b20231006` release generates outputs normally. Not sure if this issue is caused by PR [#9066](https://github.com/intel-analytics/BigDL/pull/9066).
### ENV
bigdl-llm: **the main branch** (2…