-
So I have a GPTQ Llama model I downloaded (from TheBloke), and it's already 4-bit quantized. I have to pass in False for the load_in_4bit parameter of:
```
model, tokenizer = FastLlamaModel.from_pr…
```
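For reference, a minimal sketch of how that loader call is typically written with Unsloth-style APIs; the full method name, the extra parameters, and the model path below are assumptions filling in for the truncated snippet, not taken from it:

```python
# Hypothetical sketch (assumed import path and arguments): the checkpoint is
# already GPTQ-quantized, so bitsandbytes 4-bit loading is turned off.
from unsloth import FastLlamaModel

model, tokenizer = FastLlamaModel.from_pretrained(
    model_name="TheBloke/Llama-2-7B-GPTQ",  # placeholder GPTQ checkpoint
    max_seq_length=2048,                    # assumed parameter
    load_in_4bit=False,                     # don't re-quantize an already-quantized model
)
```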
-
System specs: RTX 3080 10 GB with 32 GB RAM.
This is potentially an issue due to low VRAM, with the system not freeing up enough memory before the next generation, but I can't say for sure.
Logs for debugging:
```pow…
```
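Not tied to the logs above; just a generic sketch of the cleanup that usually helps when VRAM isn't released between generations in a PyTorch-based pipeline:

```python
import gc
import torch

def report_vram(tag: str) -> None:
    """Print allocated vs. cached CUDA memory in MiB."""
    alloc = torch.cuda.memory_allocated() / 2**20
    reserved = torch.cuda.memory_reserved() / 2**20
    print(f"{tag}: {alloc:.0f} MiB allocated / {reserved:.0f} MiB reserved")

# Between generations: drop references to large tensors/pipelines you no longer
# need, run garbage collection, then flush PyTorch's CUDA caching allocator.
report_vram("before cleanup")
gc.collect()
torch.cuda.empty_cache()
report_vram("after cleanup")
```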
-
I'm trying to use the FLUX dev model from Civitai in my Stable Diffusion Forge WebUI. I'm using it because it can create NSFW images, which the nf4 model is not able to create.
My laptop specs are a GTX 1650 4 GB, …
-
Really useful extensions. On dev nf4 (RTX 4070, setting 9500 MB max VRAM for the model), great speedup if there is no model change between generations.
previously:
100%|████████████████████████████████████████████…
-
Hope I don't come off the wrong way; first, I want to emphasize that I am not complaining. I appreciate all the work done here and am enjoying it on my Windows NVIDIA machine. But as someone who also has a …
-
When using Kolors ...
!!! Exception during processing !!! local variable 'app_face' referenced before assignment
Traceback (most recent call last):
File "D:\ComfyUI-aki-v1.2\execution.py", li…
-
I tried to run the sample code and use fp8 models,
```
model_manager = ModelManager(torch_dtype=torch.float8_e4m3fn, device="cuda")
```
but it still fails with `TypeError: couldn't find storage objec…
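One guess (an assumption, since the full traceback is cut off): float8_e4m3fn storage only exists in newer PyTorch builds, so a quick way to check whether the runtime itself can round-trip fp8 tensors is:

```python
import torch

# Sanity check for fp8 support in the installed PyTorch build (assumption: the
# failure comes from the runtime, not from the model files themselves).
x = torch.zeros(4, dtype=torch.float8_e4m3fn)
torch.save(x, "fp8_check.pt")
y = torch.load("fp8_check.pt")
print(torch.__version__, y.dtype)  # expect torch.float8_e4m3fn if supported
```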
-
When using fp16 FLUX + fp16 t5xxl, generating works just fine, but when it reaches the VAE decoding stage it allocates around 10 GB of my RAM, overflowing RAM into the SSD and making the PC super un…
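If this is a diffusers-based setup (an assumption; the snippet doesn't say which UI or library is in use), tiled and sliced VAE decoding is the usual way to cap memory during the decode step:

```python
import torch
from diffusers import FluxPipeline

# Sketch under the assumption of a diffusers FluxPipeline; tiling/slicing make
# the VAE decode the latents in chunks instead of one large allocation.
pipe = FluxPipeline.from_pretrained(
    "black-forest-labs/FLUX.1-dev", torch_dtype=torch.bfloat16
)
pipe.enable_model_cpu_offload()  # keep only the active component on the GPU
pipe.vae.enable_tiling()         # decode the image in tiles
pipe.vae.enable_slicing()        # decode batch items one at a time

image = pipe("a forest at dawn", num_inference_steps=28).images[0]
image.save("out.png")
```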
-
![image](https://github.com/artidoro/qlora/assets/32769358/1c7b2b84-fe78-4cd9-a92a-10a190aade40)
Could you provide example code for BLIP-2 finetuning with 4-bit NF4?
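Not an answer from the repo itself, but a common starting point is loading BLIP-2 in 4-bit NF4 with transformers + bitsandbytes and then attaching LoRA adapters via PEFT; the checkpoint name and LoRA target modules below are assumptions:

```python
import torch
from transformers import BitsAndBytesConfig, Blip2ForConditionalGeneration, Blip2Processor
from peft import LoraConfig, get_peft_model

# QLoRA-style NF4 quantization config.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_use_double_quant=True,
    bnb_4bit_compute_dtype=torch.bfloat16,
)

model_id = "Salesforce/blip2-opt-2.7b"  # placeholder checkpoint
processor = Blip2Processor.from_pretrained(model_id)
model = Blip2ForConditionalGeneration.from_pretrained(
    model_id, quantization_config=bnb_config, device_map="auto"
)

# Attach LoRA adapters; target modules for the OPT language model are an assumption.
lora_config = LoraConfig(r=16, lora_alpha=32, lora_dropout=0.05, target_modules=["q_proj", "v_proj"])
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()
```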