elastic / kibana

Your window into the Elastic Stack
https://www.elastic.co/products/kibana

[Obs AI Assistant] Improve LLM re-rank consistency #180132

Open miltonhultgren opened 2 months ago

miltonhultgren commented 2 months ago

Summary

@LucaWintergerst has reported an odd behaviour of the LLM: given a certain prompt and certain knowledge base content, it will sometimes rank a document high (7) and sometimes low (0).

We haven't been able to reproduce this yet, but the issue persists across small tweaks to the prompt.

This has real consequences, as it makes the recall step (and thus the conversation that follows) very unpredictable.

AC

Test data

Current system message:

Given the following question, score the documents that are relevant to the question. on a scale from 0 to 7,
    0 being completely irrelevant, and 7 being extremely relevant. Information is relevant to the question if it helps in
    answering the question. Judge it according to the following criteria:

    - The document is relevant to the question, and the rest of the conversation
    - The document has information relevant to the question that is not mentioned,
      or more detailed than what is available in the conversation
    - The document has a high amount of information relevant to the question compared to other documents
    - The document contains new information not mentioned before in the conversation

Prompt:

Is there a runbook for the cartservice-otel?

Knowledge base:

[
  {
    id: 'elastic/observability-aiops/ai_assistant/runbooks/slos/cartservice-runbook.md',
    text: '{"mode":"100644","path":"ai_assistant/runbooks/slos/cartservice-runbook.md","extension":".md","size":165,"name":"cartservice-runbook.md","id":"elastic/observability-aiops/ai_assistant/runbooks/slos/cartservice-runbook.md","type":"blob","body":"This is the runbook for the cartservice-otel error. If the cartservice is experiencing errors, do the following: - call Luca, he will fix it - grab a cup of coffee"}',
    score: 10
  }
]

elasticmachine commented 2 months ago

Pinging @elastic/obs-knowledge-team (Team:obs-knowledge)

miltonhultgren commented 2 months ago

If anyone has any idea what might influence this behaviour, so that we can try to reproduce it, that would be very helpful.

grabowskit commented 1 month ago

Need to find a way to reproduce.
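
One possible way to approach reproduction, sketched under assumptions: score the same question and document repeatedly and record how far the scores drift. Nothing below is taken from the Kibana codebase. `ScoreFn` is a hypothetical stand-in for whatever call sends the system message, question and document to the LLM and returns a score from 0 to 7; `summarizeScores` and `measureScoreSpread` are illustrative names.

```typescript
// Hypothetical stand-in for the real LLM call: takes the question and the
// document text and resolves to a relevance score on the 0-7 scale.
type ScoreFn = (question: string, documentText: string) => Promise<number>;

interface ScoreSpread {
  scores: number[];
  min: number;
  max: number;
  spread: number; // max - min; 0 means every run returned the same score
}

// Pure helper: summarises a list of scores collected from repeated runs.
function summarizeScores(scores: number[]): ScoreSpread {
  if (scores.length === 0) {
    throw new Error('at least one score is required');
  }
  const min = Math.min(...scores);
  const max = Math.max(...scores);
  return { scores, min, max, spread: max - min };
}

// Calls the scorer `runs` times with identical input and summarises the result.
async function measureScoreSpread(
  scoreFn: ScoreFn,
  question: string,
  documentText: string,
  runs: number
): Promise<ScoreSpread> {
  const scores: number[] = [];
  for (let i = 0; i < runs; i++) {
    // Sequential on purpose, so each run is an independent request.
    scores.push(await scoreFn(question, documentText));
  }
  return summarizeScores(scores);
}
```

With the test data above, a spread of 7 across runs would match the reported behaviour (the same document scored 7 in one run and 0 in another), while a spread of 0 would mean the inconsistency did not show up for that input.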