ggerganov / llama.cpp

LLM inference in C/C++
MIT License

Bug: llama-server + LLava 1.6 hallucinates #8001

Closed farnazj closed 1 month ago

farnazj commented 2 months ago

What happened?

When using ./llama-llava-cli, I get perfectly fine descriptions of images. But when hosting LLava with ./llama-server, LLava hallucinates big time.

Here's how I'm running LLava with the cli: ./llama-llava-cli -m models/llava-v1.6-vicuna-7b.Q5_K_S.gguf --mmproj models/mmproj-model-f16.gguf --image images/sth.jpeg -c 4096

Here's how I'm starting the server: ./llama-server -m models/llava-v1.6-vicuna-7b.Q5_K_S.gguf --mmproj models/mmproj-model-f16.gguf -c 2048 --host 127.0.0.1 --port 8000

Here's the python code to send the request:

import requests
import base64

def encode_image(image_path):
    with open(image_path, "rb") as image_file:
        return base64.b64encode(image_file.read()).decode('utf-8')

base64_image = encode_image("./images/sth.png")

headers = {
    'Content-Type': 'application/json',
}

json_data = {
    'image_data': [{
        'data': base64_image,
        'id': 10
    }],
    "prompt": "USER:[img-10]Describe the image.\nASSISTANT:",
    "temperature": 0.1
}

response = requests.post('http://127.0.0.1:8000/completion', headers=headers, json=json_data)
print(response.json()["content"])
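For reference, the payload construction above can be factored into a small helper. This is a minimal sketch (the helper name is mine, not part of llama.cpp); it only reproduces the `image_data`/`prompt` shape used in the script, where the `[img-<id>]` tag in the prompt must match the `id` of the corresponding `image_data` entry:

```python
import base64

def build_llava_payload(image_bytes: bytes, question: str,
                        img_id: int = 10, temperature: float = 0.1) -> dict:
    """Build a /completion payload that references one inline image.

    Note: the [img-<id>] tag in the prompt must match the id given
    in the image_data entry, otherwise the image is not associated
    with the prompt.
    """
    return {
        "image_data": [{
            "data": base64.b64encode(image_bytes).decode("utf-8"),
            "id": img_id,
        }],
        "prompt": f"USER:[img-{img_id}]{question}\nASSISTANT:",
        "temperature": temperature,
    }
```

This can then be passed directly as the `json=` argument of `requests.post`, as in the script above.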

Name and Version

./llama-cli --version
version: 3173 (a94e6ff8)
built with Apple clang version 15.0.0 (clang-1500.3.9.4) for arm64-apple-darwin23.5.0

What operating system are you seeing the problem on?

Mac

Relevant log output

No response

ngxson commented 2 months ago

Multimodal is currently not supported on the server. The model generates the response without looking at the image (so it hallucinates).

Related to https://github.com/ggerganov/llama.cpp/issues/8010

farnazj commented 2 months ago

Oh no :( What is the latest stable release/commit that still supports multimodal?

github-actions[bot] commented 1 month ago

This issue was closed because it has been inactive for 14 days since being marked as stale.