
[Discuss] Knowledge Bases across the Kibana Platform #184468

Open spong opened 1 month ago

spong commented 1 month ago

Summary

This is an open discuss issue for ideating around Knowledge Bases across the Kibana Platform. I will do my best to capture terminology, current implementations, and future use cases, but please feel free to correct me or add additional details. 🙂

Note that this is mostly from the perspective of a GenAI Solutions dev working on assistants and LLM-backed features within Kibana.


Definitions

First, let's align on what a Knowledge Base is and isn't -- roughly based on how they've already been implemented in Kibana. We'll shorten it to KB from here on.

Generally, a KB is a collection of user- or domain-specific documents that may be relevant when running inference for a given user or feature prompt. If context is the LLM's short-term memory, the KB could be considered its long-term memory. KB content can be anything from a user's favorite things, workflow preferences (always query this specific index for alerts), or Slack conversations, to domain-specific documents like our ES|QL Documentation, our Elastic Security Labs Content, or the entirety of our ES/Kibana Documentation, and so forth.

While KB content can be thought of as long-term memory for LLMs, it is not the only long-term memory retrieved in RAG applications. For instance, Detection Alerts, APM Traces, or even raw logs could be retrieved to augment the context for the desired task; for the purposes of this discussion, however, we will not consider those KB content. KB content often contains embeddings and metadata to aid in retrieval, which is not typically the case for raw logs or log-derivative data like alerts and traces, though this can vary by use case.

To put it more concretely, a KB is a data stream or index of documents that contain some metadata, plus raw content along with its embeddings. Retrieval of this content should be controllable via document-level security, or some form of basic RBAC based on the current user, to ensure proper scoping of KB content. Not all users can modify all KB data, and some KB data could be considered 'system' data, in that it is stack-versioned and provided by default (think Kibana Documentation). KB data is usually tied to the model that created its embeddings. All current KB implementations within Kibana (O11y/Security Assistants, Search Playground) use ELSER as this model, but we should probably think about supporting generic retrieval of documents embedded with other models for wider flexibility. Perhaps there's some crossover here with the _inference API.
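To make that shape concrete, here's a minimal sketch of what such an index could look like. This is illustrative only -- the index name, field names, and the ELSER sparse_vector field are assumptions, not the mappings either assistant actually uses:

```ts
// Illustrative sketch of a KB index: raw content, metadata, RBAC scoping
// fields, and an ELSER embedding field. All names are hypothetical.
import { Client } from '@elastic/elasticsearch';

const client = new Client({ node: 'http://localhost:9200' });

await client.indices.create({
  index: 'kb-example',
  mappings: {
    properties: {
      text: { type: 'text' },          // the raw KB content
      metadata: { type: 'object' },    // free-form metadata to aid retrieval
      required: { type: 'boolean' },   // always include in the initial context?
      public: { type: 'boolean' },     // visible to all users in the space?
      users: { type: 'keyword' },      // basic RBAC: which users own this entry
      namespace: { type: 'keyword' },  // space awareness
      // ELSER output; sparse_vector is the 8.11+ field type for this
      ml: { properties: { tokens: { type: 'sparse_vector' } } },
    },
  },
});
```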


Usage

Generally speaking, KB content is leveraged in RAG applications in two main ways. The first is to prime the initial context with relevant KB content, either based on a similarity search of the input prompt (or a re-written prompt), or based on some static filter like required:true so that certain documents are always included. In this mode, the KB content behaves like 'custom user instructions': static content that is always part of the context. The second is to expose the KB as a tool/function/retriever to the LLM, so it knows what type of content it has access to and can decide when to query for it.
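As a rough sketch of those two retrieval modes against the hypothetical index above (the query shapes and ELSER model id here are assumptions, not either assistant's actual recall code):

```ts
// Sketch of the two usage modes against a hypothetical ELSER-embedded KB index.
import { Client } from '@elastic/elasticsearch';

const client = new Client({ node: 'http://localhost:9200' });

// 1) Static priming: documents flagged required:true are always included.
const pinned = await client.search({
  index: 'kb-example',
  query: { term: { required: true } },
});

// 2) Similarity search over ELSER embeddings for the (re-written) user prompt.
const recalled = await client.search({
  index: 'kb-example',
  size: 5,
  query: {
    text_expansion: {
      'ml.tokens': {
        model_id: '.elser_model_2',
        model_text: 'how do I query alerts for my space?',
      },
    },
  },
});
```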

When providing tools, it may be beneficial to provide multiple 'scoped' tools for accessing the knowledge base, as opposed to a single generic KB retriever. Building RAG applications means solving the search problem at multiple intersections. We can leverage ES for ranking and relevance, but we should leverage the LLM for inference of intent. For instance, if we had just a single KB retrieval tool and the user asks "what are my open github issues", 'relevant' results might be returned, but the context could still end up polluted with spurious KB entries like GitHub bot Slack messages. By instead providing a dedicated tool for the GitHub KB content with a description like "query this KB for the user's GitHub issues and pull requests", we can use LLM inference as the first 'filter' of retrieval.
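To illustrate, here's roughly what a generic retriever vs. a scoped GitHub tool could look like as function-calling definitions. The KbTool shape is hypothetical, not a Kibana API:

```ts
// Hypothetical tool definitions following common function-calling schemas.
interface KbTool {
  name: string;
  description: string; // the LLM uses this to decide when to call the tool
  parameters: Record<string, unknown>; // JSON Schema for the tool arguments
}

// A single generic retriever: ES relevance is the only filter.
const genericKbTool: KbTool = {
  name: 'retrieve_kb',
  description: 'Query the knowledge base for documents relevant to the conversation.',
  parameters: {
    type: 'object',
    properties: { query: { type: 'string' } },
    required: ['query'],
  },
};

// A scoped tool: the LLM's inference of intent becomes the first filter.
const githubKbTool: KbTool = {
  name: 'retrieve_github_kb',
  description: "Query this KB for the user's GitHub issues and pull requests.",
  parameters: {
    type: 'object',
    properties: { query: { type: 'string' } },
    required: ['query'],
  },
};
```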

Note: this area is still rapidly developing, and there are many other methods that could be used here. When building agent graphs, you could have an agent that compiles and maintains a registry of tools by querying and annotating your KB datasets. Or perhaps you only register very specific tools that map 1:1 to specific user tasks, but each of these tools has access to certain KB retrievers.


Implementation Details

Currently, there are two KB 'implementations' within Kibana:

The O11y Assistant, which can be configured to retrieve content from either the .kibana-observability-ai-assistant-kb index, or any installed Search Connectors. During recall, content from both sources is combined, sorted, stuffed into the token budget, and re-ranked using the LLM (internal slack). The KB index exists globally, and RBAC is managed via namespace, user, and public fields (component template). Content is embedded via ELSER + an ingest pipeline. Currently the only 'system' KB content provided by default is the lens documentation. Custom KB content can be added either via the Stack Mgmt 'AI Assistants' Settings UI, or by indexing directly into the KB index (docs). There is a private CRUD API /internal/observability_ai_assistant/kb/ that powers the settings UI.
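For illustration, a recall query scoped by those RBAC fields might look roughly like this. The exact field names and query shape are assumptions based on the component template, not the assistant's actual implementation:

```ts
// Hypothetical recall query scoped by the O11y KB's namespace/user/public fields.
import { Client } from '@elastic/elasticsearch';

const client = new Client({ node: 'http://localhost:9200' });

const results = await client.search({
  index: '.kibana-observability-ai-assistant-kb',
  query: {
    bool: {
      filter: [
        { term: { namespace: 'default' } }, // current space
        {
          bool: {
            should: [
              { term: { public: true } },        // shared entries
              { term: { 'user.name': 'jane' } }, // current user's entries (field name assumed)
            ],
            minimum_should_match: 1,
          },
        },
      ],
    },
  },
});
```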

The Security Assistant, which is configured to retrieve content from the .kibana-elastic-ai-assistant-kb index, and currently only supports retrieving our ES|QL documentation, which is included as 'system' KB content. Content is embedded via ELSER + an ingest pipeline.
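The 'ELSER + ingest pipeline' embedding both assistants use can be sketched as an inference processor in an ingest pipeline. The pipeline id and field names here are illustrative, not the actual pipelines:

```ts
// Minimal sketch of embedding KB content at ingest time with ELSER.
import { Client } from '@elastic/elasticsearch';

const client = new Client({ node: 'http://localhost:9200' });

await client.ingest.putPipeline({
  id: 'kb-elser-embeddings',
  processors: [
    {
      inference: {
        model_id: '.elser_model_2',
        // write sparse-vector tokens for the `text` field into `ml.tokens`
        input_output: [{ input_field: 'text', output_field: 'ml.tokens' }],
      },
    },
  ],
});
```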

In 8.15, we are expanding the KB functionality by leveraging the AssistantDataClient architecture, which creates a space-aware data stream .kibana-elastic-ai-assistant-knowledge-base-default, and will provide a bulk CRUD API for KB content management. RBAC is managed via a users array (component template) and Kibana Feature Privileges (similar to centralized anonymization), and we will be trialing the new semantic_text field for embedding/chunking (while continuing to support content embedded via ingest pipelines). A similar recall strategy to O11y's will be used, where required:true documents are always included in the initial context, and a general KB tool will be registered for inline retrieval and saving of KB content. One difference is that an 'index' KB entry type will be introduced to support 'search connectors' and generic KB indices/data streams. This KB entry type will take an index name, a search field, and a 'custom instructions' text field that will be used to generate a tool per KB entry. This will also provide an avenue for 'advanced configuration' when wanting to override the search/ranking strategy used for that specific index.
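As a sketch, the 'index' KB entry type described above might carry roughly these fields. This is a guess at the shape based on the description, not the actual schema:

```ts
// Hypothetical shape of the 8.15 'index' KB entry type.
interface IndexKbEntry {
  type: 'index';
  name: string;        // display name for the KB entry
  index: string;       // the index or data stream to query
  field: string;       // the search field (e.g. a semantic_text field)
  description: string; // 'custom instructions' used to generate the per-entry tool
  // 'advanced configuration' escape hatch for overriding search/ranking
  queryOverride?: Record<string, unknown>;
}
```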


Looking Forward

For the best user and developer experience, it might be ideal if we could streamline some parts of the definition, management, and interoperability of Knowledge Bases across the Kibana Platform. This is still a rapidly developing space, so it may be difficult to align on APIs or functionality, but hopefully we can use this discuss issue to orient around core concepts and march towards some sort of platform-wide interoperability where it makes sense.

Knowledge Base Integrations

Earlier this year I put together a proposal for Knowledge Base Integrations, which would be a way of delivering KB content via fleet integrations. This idea is appealing for many reasons. The first is that it gives users the ability to generate integrations that behave like 'Custom GPTs', and manage them externally to the stack. The second is our ability to provide stack-versioned 'system' KB content, like our documentation, so that it can be used by any consumer in the stack, without requiring individual solutions to manage content delivery, versioning, re-embedding, etc. Please see the issue for all the details.

I have a POC of delivering our ES|QL Documentation as a KB Integration and would like to get an experimental version in place for a near-term release (~8.16). The hope here is that we can use this as a test bed for aligning interoperability of generic KB retrieval, and see where things go from there.

spong commented 1 month ago

One aspect that I would appreciate some additional commentary around is how Saved Objects fit into all of this... Today we could do a simple tool/retriever with a user-scoped SO client to do a keyword search against titles or descriptions, but it would be nice if we started embedding some fields on SOs automatically if ELSER is deployed. I think I recall a PoC that embedded the description of Dashboard SOs for a better vector search experience across thousands of dashboards, but I'm not sure if anything came of it. I'll try to track that down...

Either way, whether most SO types like Dashboards, Visualizations, Cases, Timelines, Notes, etc. are considered KB content or not (I would think so), users should be able to retrieve and ask questions about this content. So it would be nice to discuss the current capabilities of the SO client, and how the new semantic_text field might be able to provide embeddings for common/useful fields if ELSER is deployed.
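As a reference point, a semantic_text field that would give, say, dashboard descriptions automatic embeddings might look like this. The index name and fields are made up, and inference_id refers to an endpoint created via the _inference API:

```ts
// Sketch: semantic_text handles chunking and embedding at ingest time.
import { Client } from '@elastic/elasticsearch';

const client = new Client({ node: 'http://localhost:9200' });

await client.indices.create({
  index: 'dashboard-descriptions',
  mappings: {
    properties: {
      title: { type: 'text' },
      // embeds via the referenced inference endpoint (e.g. an ELSER deployment)
      description: { type: 'semantic_text', inference_id: 'my-elser-endpoint' },
    },
  },
});
```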

dgieselaar commented 1 month ago

I am very ++ on the idea of having a knowledge base service. It seems like a great candidate to extract out some common functionality for both assistants. One thing re: search connectors:

> One difference is that an 'index' KB entry type will be introduced to support 'search connectors' and generic KB indices/data streams. This KB entry type will take an index name, a search field, and a 'custom instructions' text field that will be used to generate a tool per KB entry. This will also provide an avenue for 'advanced configuration' when wanting to override the search/ranking strategy used for that specific index.

We have changed our implementation for 8.15: we query the connectors metadata index to find all the indices where search content is being written to, and then query those. The user can override this with an advanced setting.
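For anyone following along, that flow might look roughly like this. Here .elastic-connectors is assumed as the connectors metadata index, and both the fields and the final query are assumptions, not the assistant's real implementation:

```ts
// Sketch: discover connector-backed indices, then search across them.
import { Client } from '@elastic/elasticsearch';

const client = new Client({ node: 'http://localhost:9200' });

const connectors = await client.search<{ index_name?: string }>({
  index: '.elastic-connectors',
  _source: ['index_name'],
  size: 100,
});

const targetIndices = connectors.hits.hits
  .map((hit) => hit._source?.index_name)
  .filter((name): name is string => Boolean(name));

// Query KB content across all connector-backed indices at once.
const results = await client.search({
  index: targetIndices.join(','),
  query: { match: { body: 'user prompt here' } }, // search field assumed
});
```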

spong commented 1 month ago

> We have changed our implementation for 8.15: we query the connectors metadata index to find all the indices where search content is being written to, and then query those. The user can override this with an advanced setting.

Nice! I thought I saw some rumblings from @sorenlouv about this -- this is good stuff. 🙂 Will they exist as individual or 'toggleable' KB entries in the current table, or are 'index-backed knowledge bases' a special case separate from manual entries that'll stay in the Search Connectors tab you have in settings?

I ask this as I'm thinking about how Knowledge Base Integrations would fit into this experience. Right now I'm thinking that if you install an Integration that has KB content, it automatically shows up in the KB table (just like the lens content does now), and the user would be able to enable/disable it, and maybe change some advanced settings, like the tool prompt(s) or retrieval strategies. Seems like this might work well for Search Connectors (which I think are getting marketed as integrations now too, right?). That makes for a nice out-of-the-box UX where content you set your stack up with automatically starts becoming available within assistant (or more general GenAI) experiences, but can be further configured/disabled if desired.

sorenlouv commented 1 month ago

> Will they exist as individual or 'toggleable' KB entries in the current table, or are 'index-backed knowledge bases' a special case separate from manual entries that'll stay in the Search Connectors tab you have in settings?

No, content from search connectors won't show up in the knowledge base table. Only entries from the Observability Assistant Knowledge Base are visible. But it's a good point: right now we have no UI for showing entries from "knowledge bases" other than our own, and instead we point users to the Search Connectors UI. It's a good idea to rethink this.

There's a third kind of knowledge base that I don't see mentioned: custom indices where customers choose to index vector data. With the addition of the Advanced setting that @dgieselaar mentioned, it is now possible for users to also make that content available to the Observability Assistant.