-
Hey, while running the 4-bit quantized model from https://huggingface.co/ThetaCursed/Ovis1.6-Gemma2-9B-bnb-4bit, I am getting the following error:
```
{
"name": "RuntimeError",
"message": "self an…
-
Good day.
After loading a saved LoRA model, I save it as a merged model. And after loading it from the merged checkpoint, I get generation like '+++++ 1000000000000000000000000000000000000000000000000…
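For reference, a minimal sketch of the merge workflow being described, assuming a peft LoRA adapter on top of a transformers base model; the model names and paths are placeholders:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

# Placeholder names; substitute the actual base model and adapter directory.
base = AutoModelForCausalLM.from_pretrained("base-model", torch_dtype=torch.float16)
tokenizer = AutoTokenizer.from_pretrained("base-model")

# Load the saved LoRA adapter and fold its weights into the base model.
model = PeftModel.from_pretrained(base, "lora-adapter-dir")
merged = model.merge_and_unload()

# Save the merged checkpoint for standalone loading later.
merged.save_pretrained("merged-dir")
tokenizer.save_pretrained("merged-dir")
```

One commonly reported cause of degenerate output like this is merging the adapter into a quantized (e.g. 4-bit) base instead of a full-precision copy of the base model.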
-
As mentioned in the title.
-
I tried to modify your example code to run this model on a low-VRAM card with a BNB 4-bit or 8-bit quantization config.
When using the bnb 4-bit config like below:
```python
qnt_config = BitsAndBytesConfig(load…
```
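For comparison, a typical 4-bit setup looks roughly like the sketch below; the quant type, compute dtype, and the placeholder model name are assumptions rather than values from the truncated snippet:

```python
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig

# Common NF4 4-bit settings; adjust to the card's VRAM budget.
qnt_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.float16,
    bnb_4bit_use_double_quant=True,
)

# "some/model" is a placeholder; the issue does not say which model is loaded.
model = AutoModelForCausalLM.from_pretrained(
    "some/model",
    quantization_config=qnt_config,
    device_map="auto",
)
```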
-
The highest-capability models that can run on the latest iPhone would be useful. The best I've found that fits in 8 GB of RAM is Qwen 2.5 7B at 4-bit?
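As a rough sanity check on that 8 GB figure (back-of-envelope only; it ignores the KV cache, activations, and iOS per-app memory limits):

```python
# Back-of-envelope weight-memory estimate for a 7B model at 4-bit.
params = 7e9
bits_per_weight = 4.5  # assumption: ~4 bits plus quantization scales/overhead
weight_gb = params * bits_per_weight / 8 / 1e9
print(f"approx. weight memory: {weight_gb:.1f} GB")  # ~3.9 GB, under 8 GB of RAM
```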
-
After quantizing mlx-community/miqumaid-v3-70b with the command `mlx_lm.convert --hf-path miqumaid-v3-70b --mlx-path miqumaid-v3-70b-4bit -q --qbits 4`, the resulting model miqumaid-v3-70b-4bit cannot be infe…
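For reference, a minimal sketch of how the converted model would typically be smoke-tested with the mlx_lm Python API; the prompt and token budget are arbitrary:

```python
from mlx_lm import load, generate

# Load the locally converted 4-bit model directory.
model, tokenizer = load("miqumaid-v3-70b-4bit")

# Quick smoke test; prompt and max_tokens are arbitrary.
text = generate(model, tokenizer, prompt="Hello, how are you?", max_tokens=64)
print(text)
```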
-
How many GPUs are needed to fine-tune? I have tried 16 GPUs (96 GB each) but got CUDA OUT OF MEMORY.
-
I've tried to run argmaxinc/mlx-stable-diffusion-3.5-large-4bit-quantized without conda, using venv instead.
I am using pinned Python 3.10 and PyTorch 2.4.0 versions to avoid compatibility issues.
…
-
Code:
```
from unsloth import FastLanguageModel
import torch
max_seq_length = 16384 # Choose any! We auto support RoPE Scaling internally!
dtype = None # None for auto detection. Float16 for Te…
```
-
This happens when I try to load any '-bnb-4bit' model, but not when loading, for instance, 'unsloth/Meta-Llama-3.1-8B'.
No error is shown in the terminal.
`model, tokenizer = FastLanguageModel.from_pretrained(
…
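For reference, the typical full form of that call from the unsloth examples looks roughly like this; the model name and the load_in_4bit flag below are illustrative, not taken from the truncated snippet:

```python
from unsloth import FastLanguageModel

# Illustrative values; the original call above is truncated.
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="unsloth/Meta-Llama-3.1-8B-bnb-4bit",  # assumed pre-quantized checkpoint
    max_seq_length=16384,
    dtype=None,            # auto-detect
    load_in_4bit=True,     # use the bitsandbytes 4-bit weights
)
```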