ggerganov / llama.cpp

LLM inference in C/C++
MIT License
65.74k stars 9.44k forks

Question: Why do llama.cpp MobileVLM model (fp16) inference results differ from the official PyTorch project's results? Is this normal? #7614

Closed lijianxing123 closed 2 months ago

lijianxing123 commented 4 months ago

Prerequisites

Background Description

llama.cpp run cmd:

    ./llava-cli -m /mnt/nas_data2/wb_space/MobileVLMV2/MobileVLM_V2-1.7B_bk/ggml-model-f32.gguf --mmproj /mnt/nas_data2/wb_space/MobileVLMV2/MobileVLM_V2-1.7B_bk/mmproj-model-f16.gguf --image /mnt/nas_data2/wb_space/MobileVLMV2/assets/samples/demo.jpg -p "please describe this images." --temp 0 --top-p 1 -c 4096

llama.cpp result:

The image is a digital art piece that captures the essence of history through its depiction. It features an illustration from "The Story Of World History" by Susan Wise Bauer, Revised Edition: Volume II - From Rome to Middle Ages (Volume 2)


Official PyTorch project params:

    model_path = './MobileVLM_V2-1.7B'
    image_file = "assets/samples/demo.jpg"
    prompt_str = "please describe this images."
    args = type('Args', (), {
        "model_path": model_path,
        "image_file": image_file,
        "prompt": prompt_str,
        "conv_mode": "v1",
        "temperature": 0,
        "top_p": None,
        "num_beams": 1,
        "max_new_tokens": 512,
        "load_8bit": False,
        "load_4bit": False,
    })()

    inference_once(args)

official pytorch project results: 🚀 MobileVLM_V2-1.7B: The image is a vivid depiction of the cover of a book titled "The Story of the World: History for the Classical Child, Vol. 2: The Middle Ages, Volume 2: The Fall of Rome to the Rise of the Normans (Revised Edition)". The cover art is a captivating illustration of a knight on horseback, armed with a bow and arrow, poised for battle. The title of the book, "The Story of the World: History for the Classical Child, Vol. 2: The Middle Ages, Volume 2: The Fall of Rome to the Rise of the Normans", is prominently displayed in large, bold letters at the top of the cover. The author's name, Susan Wise Bauer, is also visible, indicating her authorship of the book. The overall design of the cover suggests a theme of adventure and exploration, fitting for a book about history.

Possible Answer

No response

JohannesGaessler commented 4 months ago

It is expected that the results will not be bit-for-bit identical when you change the inference code, because that changes the floating-point rounding error, and neural networks are very sensitive to said rounding error.
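To illustrate the point about rounding error, here is a minimal sketch (not taken from llama.cpp or the PyTorch project) that adds the same three numbers in two different orders while rounding to half precision after every step. The helpers `to_fp16` and `fp16_sum` are hypothetical names made up for this example; they rely only on the standard library's `struct` module, whose `e` format is IEEE 754 half precision.

```python
import struct

def to_fp16(x: float) -> float:
    """Round a Python float to the nearest IEEE 754 half-precision value."""
    return struct.unpack("<e", struct.pack("<e", x))[0]

def fp16_sum(values):
    """Accumulate left to right, rounding to fp16 after every addition."""
    total = 0.0
    for v in values:
        total = to_fp16(total + to_fp16(v))
    return total

values = [2048.0, 1.0, 1.0]

# Large value first: fp16 spacing near 2048 is 2, so each +1 is rounded away.
forward = fp16_sum(values)
# Small values first: 1 + 1 = 2 is exact, and 2 + 2048 = 2050 is representable.
backward = fp16_sum(list(reversed(values)))

print(forward, backward)  # 2048.0 2050.0

# The same effect exists in double precision, only at a smaller scale.
print((0.1 + 0.2) + 0.3 == 0.1 + (0.2 + 0.3))  # False
```

Two inference implementations generally order their additions differently (for example inside matrix multiplications), so small differences like this accumulate. With `--temp 0` the most likely token is picked at every step, so a tiny difference in the logits can be enough to select a different token once, after which the two outputs diverge.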

lijianxing123 commented 4 months ago

Is there any way to avoid it?

JohannesGaessler commented 4 months ago

No.

lijianxing123 commented 4 months ago

Can we use llama.cpp to train a model?

JohannesGaessler commented 4 months ago

As of right now, no.

RaghavYadav04 commented 4 months ago

The model used in llama.cpp is in GGUF format, while the PyTorch project uses a standard model file. Ensure that the conversion process between formats did not introduce any discrepancies.
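One possible way to check the conversion is to load the weights from both files and compare them tensor by tensor. The sketch below covers only the comparison step and makes several assumptions: `compare_tensors` is a hypothetical helper, both models have already been loaded into plain `{name: list of floats}` mappings, the tensor names have been mapped to a common naming scheme (conversion usually renames tensors), and the tolerance `atol` is an arbitrary choice.

```python
import math

def compare_tensors(reference, converted, atol=1e-3):
    """Compare two {name: flat list of floats} mappings tensor by tensor.

    Returns {name: stats}, where stats holds a status string and, when the
    tensors are comparable, the max absolute difference and cosine similarity.
    """
    report = {}
    for name, ref in reference.items():
        other = converted.get(name)
        if other is None or len(other) != len(ref):
            # Tensor is absent after conversion or has a different size.
            report[name] = {"status": "missing_or_size_mismatch"}
            continue
        max_abs = max((abs(a - b) for a, b in zip(ref, other)), default=0.0)
        dot = sum(a * b for a, b in zip(ref, other))
        norm = math.sqrt(sum(a * a for a in ref)) * math.sqrt(sum(b * b for b in other))
        cosine = dot / norm if norm > 0 else float("nan")
        report[name] = {
            "status": "ok" if max_abs <= atol else "differs",
            "max_abs_diff": max_abs,
            "cosine": cosine,
        }
    return report

# Toy data standing in for real weights (values are made up for illustration).
reference = {"w": [1.0, 2.0, 3.0], "b": [0.5], "x": [1.0]}
converted = {"w": [1.0, 2.0, 3.0005], "b": [0.75]}

for name, stats in compare_tensors(reference, converted).items():
    print(name, stats)
```

Small differences are expected when the GGUF file stores weights in f16 and the reference is f32; large differences or missing tensors would point to a conversion problem. If the weights match but the outputs still differ, the remaining candidates are image preprocessing, the prompt template, and the rounding behaviour of the inference code itself.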

lijianxing123 commented 3 months ago

> The model used in llama.cpp is in GGUF format, while the PyTorch project uses a standard model file. Ensure that the conversion process between formats did not introduce any discrepancies.

Thank you. How can I check or ensure that there are no discrepancies?

github-actions[bot] commented 2 months ago

This issue was closed because it has been inactive for 14 days since being marked as stale.