Open bhavyajoshi-mahindra opened 1 hour ago
A couple of points to add here.
1) While saving the quantized model, the safetensors file is saved under the default name "gptq_model-4bit-128g.safetensors", but the name expected by transformers is "model.safetensors".
2) No "model.safetensors.index.json" file is saved during quantization of the custom Qwen2-VL; I am not sure whether it is needed or not. I used "use_safetensors=True" while saving the quantized model.
3) When I tried updating transformers to 4.46.0 and tokenizers to 0.20.1, I got the same error.
4) I tried renaming "gptq_model-4bit-128g.safetensors" to "model.safetensors" and got this error: "Exception: data did not match any variant of untagged enum ModelWrapper at line 757378 column 3".
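For reference, the manual rename in point 4 can be scripted right after quantization; this is a minimal sketch, assuming the quantized output directory contains the default auto-gptq file name and that transformers looks for "model.safetensors" (the helper name is illustrative, not part of any library):

```python
import os

def rename_quantized_weights(output_dir: str) -> str:
    """Rename the auto-gptq default weight file to the single-file
    name that transformers' from_pretrained() expects."""
    src = os.path.join(output_dir, "gptq_model-4bit-128g.safetensors")
    dst = os.path.join(output_dir, "model.safetensors")
    # Only rename if the default file exists and the target does not.
    if os.path.exists(src) and not os.path.exists(dst):
        os.rename(src, dst)
    return dst
```

Note that the "untagged enum ModelWrapper" exception in point 4 typically comes from an older tokenizers version failing to parse a newer tokenizer.json, not from the weights file itself, so the rename alone may not be enough.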
Hi, I have fine-tuned Qwen2-VL using Llama-Factory. I successfully quantized the fine-tuned model as given.
But then I tried to run inference with the quantized custom Qwen2-VL model using this code...
I got this error: OSError: Error no file named model.safetensors found in directory /content/drive/MyDrive/LLM/vinplate2-gwen2-vl-gptq-4bit.
I am not sure what I did wrong. Please help me. I think it's a transformers version issue, but I'm not sure which version is correct.
My Environment:
- Linux (Google Colab)
- CUDA 12.2
- Python 3.10.12
- transformers 4.45.0.dev0 (pip install git+https://github.com/huggingface/transformers@21fac7abba2a37fae86106f87fcf9974fd1e3830 accelerate)
- torch 2.4.1+cu121
- auto_gptq 0.6.0.dev0+cu1222
- accelerate 1.0.1
- ninja 1.11.1.1
- tokenizers 0.19.1
- flash_attn 2.6.3
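A version report like the one above can be collected programmatically, which helps when comparing environments across Colab sessions; this is a minimal sketch using only the standard library (the package list is illustrative):

```python
from importlib.metadata import version, PackageNotFoundError

def report_versions(packages):
    """Return a {package: version} map for the current environment,
    marking packages that are not installed."""
    out = {}
    for name in packages:
        try:
            out[name] = version(name)
        except PackageNotFoundError:
            out[name] = "not installed"
    return out

if __name__ == "__main__":
    print(report_versions(
        ["transformers", "tokenizers", "auto-gptq",
         "torch", "accelerate", "flash-attn"]))
```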