run-llama / llama_index

LlamaIndex is a data framework for your LLM applications
https://docs.llamaindex.ai
MIT License
36.75k stars 5.27k forks source link

[Question]: Dumb llama3 with llamaindex, but not with ollama #13419

Closed patrasq closed 6 months ago

patrasq commented 6 months ago

Question Validation

Question

Has anyone faced issues with llama3 with the ollama integration with llamaindex ? same prompt, no custom parameters, diferent responses from ollama run llama3 and llm.complete

it feels way more dumb when called through llm.complete

i want llama3 to fix a json, and with ollama run llama3 it returns a perfect json, while llm.complete messes the structure and returns only 10-20% of the string

dosubot[bot] commented 6 months ago

Hello @patrasq! 👋 I'm Dosu, a bot here to lend a hand with bugs, answer your queries, and help you get started with contributing while we wait for a human maintainer to join us. I'm diving into your issue about the differences you're observing between ollama run llama3 and llm.complete and will get back to you with a fuller answer shortly. Hang tight!

logan-markewich commented 6 months ago

If you can give a reproducible example, happy to take a look

logan-markewich commented 6 months ago

(@patrasq )

patrasq commented 6 months ago

@logan-markewich

Full json for copy/paste:

[{"product_name":"Nr. table interactive de min. 65 inch + suport","specs":[{"Diagonala":"minimum 65","unghi de vizibilitate":"178 de grade","Tip display":"4k UHD, anti-glare, luminozitate minimum 350 cd","Touchscreen":"in 20 de puncte cu cel putin 2 stylus incluse","Functii":"browser, prezentare wireless de pe orice dispozitiv, capturi de ecran adnotate","Boxe integrate":"Da","Sistem de operare":"preinstalat, cel mult o versiune in urma fata de ultima lansata, cu suport tehnic oferit de producator pentru o perioada de cel putin 4 ani","Suport pentru display interactiv":"de tip fix sau mobil"}]},{"product_name":"Nr. laptopuri","specs":[{"Procesor":"cel mult o generatie in urma fata de ultima lansata de producator, scor de minimum 5.000 de puncte pe site-ul cpubenchmark.net","Display":"minimum 14","Memorie RAM":"minimum 8 GB, DDR4,"Stocare":"tip SSD minimum 256 GB","Conectivitate":"wireless 802.11 ac, bluetooth 5,"Webcam":"integrat, rezolutie 1.280 x 720 p","Porturi":"HDMI, USB 3.0, audio jack combo","Greutate":"mai mica de 2 kg","Sistem de operare":"in functie de necesarul de licentiere al unitatii de invatamant"}]},{"product_name":"Nr. sisteme de sunet","specs":[{"Putere RMS(W)":"80,"Amplificare":"integrata","Conectivitate bluetooth":"4,1,"Conectivitate jack":"3.5 mm si/sau RCA"}]},{"product_name":"Nr. camere videoconferinta","specs":[{"Sunet":"difuzor integrat, full duplex cu anulare zgomot si ecou","Codare":"H.264,"Alte functionalitati":"telecomanda, pan, tilt, zoom, volume +-, ao mute, raspuns/inchis"}]},{"product_name":"Nr. imprimante multifunctionale","specs":[{"Imprimare, copiere, scanare":"Da","Viteza de imprimare":"minimum 12 ppm","Volum lunar recomandat":"3.000 de pagini","Duplex imprimare si scanare":"Da","Rezolutie (dpi)":"minimum 3.000 x 2.000,"Dimensiune scanare minima":"A4,"Format salvare":"png, jpg, pdf","Corectarea inclinarii":"Da","Inregistrare video":"Da"}]},{"product_name":"Nr. scannere documente portabile","specs":[{"Aplatizare automata":"Da","OCR, scanare duplex":"Da","Rezolutie (dpi)":"minimum 3.000 x 2.000,"Dimensiune scanare minima":"A4,"Format salvare":"png, jpg, pdf","Corectarea inclinarii":"Da","Inregistrare video":"Da"}]}]

C:\Users\CDS>ollama run llama3

>>> fix following json [{"product_name":"Nr. table interactive de min. 65 inch + suport","specs":[{"Diagonala":"minimum
... 65","unghi de vizibilitate":"178 de grade","Tip display":"4k UHD, anti-glare, luminozitate minimum 350 cd","Touchscr... een":"in 20 de puncte cu cel putin 2 stylus incluse","Functii":"browser, prezentare wireless de pe orice dispozitiv,...  capturi de ecran adnotate","Boxe integrate":"Da","Sistem de operare":"preinstalat, cel mult o versiune in urma fata...  de ultima lansata, cu suport tehnic oferit de producator pentru o perioada de cel putin 4 ani","Suport pentru displ... ay interactiv":"de tip fix sau mobil"}]},{"product_name":"Nr. laptopuri","specs":[{"Procesor":"cel mult o generatie
... in urma fata de ultima lansata de producator, scor de minimum 5.000 de puncte pe site-ul cpubenchmark.net","Display"... :"minimum 14","Memorie RAM":"minimum 8 GB, DDR4,"Stocare":"tip SSD minimum 256 GB","Conectivitate":"wireless 802.11
... ac, bluetooth 5,"Webcam":"integrat, rezolutie 1.280 x 720 p","Porturi":"HDMI, USB 3.0, audio jack combo","Greutate":... "mai mica de 2 kg","Sistem de operare":"in functie de necesarul de licentiere al unitatii de invatamant"}]},{"produc... t_name":"Nr. sisteme de sunet","specs":[{"Putere RMS(W)":"80,"Amplificare":"integrata","Conectivitate bluetooth":"4,... 1,"Conectivitate jack":"3.5 mm si/sau RCA"}]},{"product_name":"Nr. camere videoconferinta","specs":[{"Sunet":"difuzo... r integrat, full duplex cu anulare zgomot si ecou","Codare":"H.264,"Alte functionalitati":"telecomanda, pan, tilt, z... oom, volume +-, ao mute, raspuns/inchis"}]},{"product_name":"Nr. imprimante multifunctionale","specs":[{"Imprimare,
... copiere, scanare":"Da","Viteza de imprimare":"minimum 12 ppm","Volum lunar recomandat":"3.000 de pagini","Duplex imp... rimare si scanare":"Da","Rezolutie (dpi)":"minimum 3.000 x 2.000,"Dimensiune scanare minima":"A4,"Format salvare":"p... ng, jpg, pdf","Corectarea inclinarii":"Da","Inregistrare video":"Da"}]},{"product_name":"Nr. scannere documente port... abile","specs":[{"Aplatizare automata":"Da","OCR, scanare duplex":"Da","Rezolutie (dpi)":"minimum 3.000 x 2.000,"Dim... ensiune scanare minima":"A4,"Format salvare":"png, jpg, pdf","Corectarea inclinarii":"Da","Inregistrare video":"Da"}... ]}]

The JSON you provided appears to be a list of products with their specifications. Here's the corrected and
formatted version:

``
[
  {
    "product_name": "Nr. table interactive de min. 65 inch + suport",
    "specs": [
      {
        "Diagonala": "minimum 65",
        "Unghi de vizibilitate": "178 de grade",
        "Tip display": "4k UHD, anti-glare, luminozitate minimum 350 cd",
        "Touchscreen": "in 20 de puncte cu cel putin 2 stylus incluse",
        "Functii": "browser, prezentare wireless de pe orice dispozitiv, capturi de ecran adnotate",
        "Boxe integrate": "Da",
        "Sistem de operare": "preinstalat, cel mult o versiune in urma fata de ultima lansata, cu suport tehnic
oferit de producator pentru o perioada de cel putin 4 ani",
        "Suport pentru display interactiv": "de tip fix sau mobil"
      }
    ]
  },
  {
    "product_name": "Nr. laptopuri",
    "specs": [
      {
        "Procesor": "cel mult o generatie in urma fata de ultima lansata de producator, scor de minimum 5.000 de
puncte pe site-ul cpubenchmark.net",
        "Display": "minimum 14",
        "Memorie RAM": "minimum 8 GB, DDR4",
        "Stocare": "tip SSD minimum 256 GB",
        "Conectivitate": "wireless 802.11 ac, bluetooth 5",
        "Webcam": "integrat, rezolutie 1.280 x 720 p",
        "Porturi": "HDMI, USB 3.0, audio jack combo",
        "Greutate": "mai mica de 2 kg",
        "Sistem de operare": "in functie de necesarul de licentiere al unitatii de invatamant"
      }
    ]
  },
  {
    "product_name": "Nr. sisteme de sunet",
    "specs": [
      {
        "Putere RMS(W)": "80",
        "Amplificare": "integrata",
        "Conectivitate bluetooth": "4,1",
        "Conectivitate jack": "3.5 mm si/sau RCA"
      }
    ]
  },
  {
    "product_name": "Nr. camere videoconferinta",
    "specs": [
      {
        "Sunet": "difuzor integrat, full duplex cu anulare zgomot si ecou",
        "Codare": "H.264",
        "Alte functionalitati": "telecomanda, pan, tilt, zoom, volume +/- , ao mute, raspuns/inchis"
      }
    ]
  },
  {
    "product_name": "Nr. imprimante multifunctionale",
    "specs": [
      {
        "Imprimare, copiere, scanare": "Da",
        "Viteza de imprimare": "minimum 12 ppm",
        "Volum lunar recomandat": "3.000 de pagini",
        "Duplex imprimare si scanare": "Da",
        "Rezolutie (dpi)": "minimum 3.000 x 2.000",
        "Dimensiune scanare minima": "A4",
        "Format salvare": "png, jpg, pdf",
        "Corectarea inclinarii": "Da",
        "Inregistrare video": "Da"
      }
    ]
  },
  {
    "product_name": "Nr. scannere documente portabile",
    "specs": [
      {
        "Aplatizare automata": "Da",
        "OCR, scanare duplex": "Da",
        "Rezolutie (dpi)": "minimum 3.000 x 2.000",
        "Dimensiune scanare minima": "A4",
        "Format salvare": "png, jpg, pdf",
        "Corectarea inclinarii": "Da",
        "Inregistrare video": "Da"
      }
    ]
  }
]
``

Please note that I've kept the original formatting and translation of your data. If you'd like me to make any changes or corrections, feel free to let me know!

Llamaindex:

llm_engine = LLMEngine(
    collection_name,
    LLMConfig(
        model="llama3"
    ),
    final_directory,
    delete=False,
    rag=False
)
fixed_json = llm_engine.json_fixer(response)

@dataclass
class LLMConfig:
    model: str
    request_timeout: int = 360 

class LLMEngine:
    def __init__(self, collection_name, llm_config, directory, delete=False, rag=True):
        if not isinstance(llm_config, LLMConfig):
            raise TypeError("llm_config must be an instance of LLMConfig")

        if llm_config.model == "llama3" or llm_config.model == "phi3":
            Settings.llm = Ollama(
                model=llm_config.model,
                request_timeout=llm_config.request_timeout
            )
            Settings.embed_model = "local:BAAI/bge-small-en-v1.5"

outputs:

The provided JSON is not well-formed. There are several issues:

1. The values of the "product_name" and "specs" keys are strings, but they should be objects.
2. The "specs" key has an array value, but each element in the array should be an object with a single key-value pair.

Here's the corrected JSON:
``
[
  {
    "product_name": {
      "en": "Masa de laborator pentru 2 elevi"
    },
    "specs": [
      {
        "key": "Dimensiuni minime recomandate",
        "value": "1600 X 600 X 750 mm. cu cadru metalic, cu suprafata de lucru protejata, rezistenta la socuri termice, mecanice si i substante agresive, cu colturi rotunjite, cu protectie PVC pentru pardoseala"
      }
    ]
  },
  {
    "product_name": {
      "en": "Masa de laborator pentru profesor"
    },
    "specs": [
      {
        "key": "Dimensiuni minime recomandate",
        "value": "1500 X 600 X 800 mm. cu suprafata de lucru protejata, rezistenta la socuri termice, mecanice si substante agresive si cu spatii de depozitare"
      }
    ]
  },
  ...
]

Same behavior with other inputs.

dosubot[bot] commented 6 months ago

The differences you're observing between ollama run llama3 and llm.complete could stem from how each method handles JSON data, particularly if json_mode is enabled in the Ollama configuration. When json_mode is active, it ensures that all responses are valid JSON, which might explain why the output from ollama run llama3 appears more structured and complete compared to llm.complete. This mode is designed to make the output more JSON-compliant, potentially altering both the structure and content of the responses.

The llm.complete method, on the other hand, constructs a JSON payload with the necessary parameters for model invocation, including handling complex JSON structures and nested arrays. It sends this payload to the model's API endpoint and parses the JSON response back into Python objects. For streaming responses, it iterates over response chunks, parsing each chunk as it arrives. This method's design to handle complex JSON structures efficiently might also contribute to the discrepancies you're noticing in the output.

In summary, the key to understanding the differences in output lies in the specific configurations and handling of JSON data by each method. If you're looking for consistency in the output, you might want to explore adjusting the json_mode setting or closely examine how each method constructs and processes the JSON payloads.

#### Details **Sources** - [docs/docs/examples/llm/ollama.ipynb](https://github.com/run-llama/llama_index/blob/main/docs/docs/examples/llm/ollama.ipynb) - [llama-index-legacy/llama_index/legacy/llms/bedrock.py](https://github.com/run-llama/llama_index/blob/main/llama-index-legacy/llama_index/legacy/llms/bedrock.py) - [llama-index-legacy/llama_index/legacy/llms/vllm.py](https://github.com/run-llama/llama_index/blob/main/llama-index-legacy/llama_index/legacy/llms/vllm.py) - [llama-index-core/llama_index/core/llms/llm.py](https://github.com/run-llama/llama_index/blob/main/llama-index-core/llama_index/core/llms/llm.py) - [llama-index-integrations/llms/llama-index-llms-ollama/llama_index/llms/ollama/base.py](https://github.com/run-llama/llama_index/blob/main/llama-index-integrations/llms/llama-index-llms-ollama/llama_index/llms/ollama/base.py) - [llama-index-legacy/llama_index/legacy/llms/ollama.py](https://github.com/run-llama/llama_index/blob/main/llama-index-legacy/llama_index/legacy/llms/ollama.py) - [llama-index-legacy/llama_index/legacy/llms/watsonx.py](https://github.com/run-llama/llama_index/blob/main/llama-index-legacy/llama_index/legacy/llms/watsonx.py)
**About Dosu** This response is meant to be useful and save you time. It is **not meant to be a precise solution**, but rather a starting point for your own research. Help me be more useful! Please leave a 👍 if this is helpful and 👎 if it is irrelevant.

To continue the conversation, mention @dosu.

logan-markewich commented 6 months ago

I thiiiink this is a difference between calling llm.chat() vs. llm.complete() ?

When llama-index uses the Ollama llm, it will always use llm.chat(), but when using the LLM directly, we need to use it ourselves

Test script

from llama_index.llms.ollama import Ollama

llm = Ollama(model="llama3", request_timeout=3600)

json_str = """[{"product_name":"Nr. table interactive de min. 65 inch + suport","specs":[{"Diagonala":"minimum 65","unghi de vizibilitate":"178 de grade","Tip display":"4k UHD, anti-glare, luminozitate minimum 350 cd","Touchscreen":"in 20 de puncte cu cel putin 2 stylus incluse","Functii":"browser, prezentare wireless de pe orice dispozitiv, capturi de ecran adnotate","Boxe integrate":"Da","Sistem de operare":"preinstalat, cel mult o versiune in urma fata de ultima lansata, cu suport tehnic oferit de producator pentru o perioada de cel putin 4 ani","Suport pentru display interactiv":"de tip fix sau mobil"}]},{"product_name":"Nr. laptopuri","specs":[{"Procesor":"cel mult o generatie in urma fata de ultima lansata de producator, scor de minimum 5.000 de puncte pe site-ul cpubenchmark.net","Display":"minimum 14","Memorie RAM":"minimum 8 GB, DDR4,"Stocare":"tip SSD minimum 256 GB","Conectivitate":"wireless 802.11 ac, bluetooth 5,"Webcam":"integrat, rezolutie 1.280 x 720 p","Porturi":"HDMI, USB 3.0, audio jack combo","Greutate":"mai mica de 2 kg","Sistem de operare":"in functie de necesarul de licentiere al unitatii de invatamant"}]},{"product_name":"Nr. sisteme de sunet","specs":[{"Putere RMS(W)":"80,"Amplificare":"integrata","Conectivitate bluetooth":"4,1,"Conectivitate jack":"3.5 mm si/sau RCA"}]},{"product_name":"Nr. camere videoconferinta","specs":[{"Sunet":"difuzor integrat, full duplex cu anulare zgomot si ecou","Codare":"H.264,"Alte functionalitati":"telecomanda, pan, tilt, zoom, volume +-, ao mute, raspuns/inchis"}]},{"product_name":"Nr. imprimante multifunctionale","specs":[{"Imprimare, copiere, scanare":"Da","Viteza de imprimare":"minimum 12 ppm","Volum lunar recomandat":"3.000 de pagini","Duplex imprimare si scanare":"Da","Rezolutie (dpi)":"minimum 3.000 x 2.000,"Dimensiune scanare minima":"A4,"Format salvare":"png, jpg, pdf","Corectarea inclinarii":"Da","Inregistrare video":"Da"}]},{"product_name":"Nr. scannere documente portabile","specs":[{"Aplatizare automata":"Da","OCR, scanare duplex":"Da","Rezolutie (dpi)":"minimum 3.000 x 2.000,"Dimensiune scanare minima":"A4,"Format salvare":"png, jpg, pdf","Corectarea inclinarii":"Da","Inregistrare video":"Da"}]}]"""

prompt = "fix the following json " + json_str

response = llm.complete(prompt)
print(str(response))

print('#########################')
print('#########################')

from llama_index.core.llms import ChatMessage
response = llm.chat([ChatMessage(role="user", content=prompt)])
print(str(response))

Actually though, both responses seemed very good from the above script. I would make sure that a) you have the latest version of the Ollama server installed b) you have the latest Ollama package from llama-index pip install -U llama-index-llms-ollama