nf4 still unsupported? - Githubissues

intel-analytics / ipex-llm

Accelerate local LLM inference and finetuning (LLaMA, Mistral, ChatGLM, Qwen, Mixtral, Gemma, Phi, MiniCPM, Qwen-VL, MiniCPM-V, etc.) on Intel XPU (e.g., local PC with iGPU and NPU, discrete GPU such as Arc, Flex and Max); seamlessly integrate with llama.cpp, Ollama, HuggingFace, LangChain, LlamaIndex, vLLM, GraphRAG, DeepSpeed, Axolotl, etc

Apache License 2.0

6.75k stars 1.27k forks source link

In the example: example/CPU/QLoRA-FineTuning/qlora_finetuning_cpu.py

It mentions on a comment that nf4 is not supported on cpu yet but when I change the example from int4 -> nf4 it still runs without errors or warnings related to nf4.

Is nf4 now supported? Otherwise if it is defaulting back to int4 I think it's worth printing an error or warning.

bnb_config = BitsAndBytesConfig(
        load_in_4bit=True,
        bnb_4bit_use_double_quant=False,
        bnb_4bit_quant_type="int4",  # nf4 not supported on cpu yet
        bnb_4bit_compute_dtype=torch.bfloat16
    )

intel-analytics / ipex-llm

nf4 still unsupported? #12427