weaviate / weaviate

Weaviate is an open-source vector database that stores both objects and vectors, allowing for the combination of vector search with structured filtering with the fault tolerance and scalability of a cloud-native database​.
https://weaviate.io/developers/weaviate/
BSD 3-Clause "New" or "Revised" License
10.39k stars 716 forks source link

autocut-llm module proposal #3973

Open CShorten opened 6 months ago

CShorten commented 6 months ago

What?

An LLM that looks at the first search result to determine if it is relevant to the query. If not, no results are presented to the user.

Why?

Broadly to avoid returning irrelevant results. Another maybe extreme example is, say you search "dalkfsl", this should return 0 results.

The current autocut implementation looks for the steepest descent in vector distances to determine the search result cutoff. This works fairly well when you have say 3-7 / 100 relevant results, but this doesn't work when the first result is not relevant.

How?

Similar to rerank and generate we will pass the first result to the model inference to determine if the result is relevant or not.

We can start with prompt engineering a zero-shot LLM to do this and then maybe fine-tuning a model (could be hosted on Anyscale or something we publish on HuggingFace, not sure yet). The key technical challenge will be enforcing the Structured Output Parsing, i.e. it needs to output "relevant" / "not relevant" or 1/0 for Weaviate to then parse the response.

It would probably be easiest to implement if we interface it with llm_autocut=True in Python, or with _additional in GraphQL:

{
  Get {
   PodClip(
     nearText: {
       concepts: ["How many years has LeBron James played in the NBA?"]
     }
   ){
   content
   speaker
   _additional {
     autocutLLM(
       property: ["content"]
     ){
     response
       }
     }
   }
  }
}

Code of Conduct

Emil-Sorensen commented 6 months ago

This would be super helpful!

stale[bot] commented 4 months ago

Thank you for your contribution to Weaviate. This issue has not received any activity in a while and has therefore been marked as stale. Stale issues will eventually be autoclosed. This does not mean that we are ruling out to work on this issue, but it most likely has not been prioritized high enough in the last months. If you believe that this issue should remain open, please leave a short reply. This lets us know that the issue is not abandoned and acts as a reminder for our team to consider prioritizing this again. Please also consider if you can make a contribution to help with the solution of this issue. If you are willing to contribute, but don't know where to start, please leave a quick message and we'll try to help you. Thank you, The Weaviate Team