jn2clark opened this issue 1 year ago
A very early draft is on this branch: https://github.com/marqo-ai/marqo/compare/mainline...gpt_reranker1
We can use GPT3 to rerank the results of a Marqo retrieval, combining the fast retrieval of tensor and lexical search with the slower, but potentially more accurate, reranking ability of LLMs.
This will also pave the way for alternative text-based model rerankers and second-stage text retrieval processing, for example text summarisation.
Marqo-GPT3 integration: GPT3 is now available as a Marqo reranker. Retrieve your documents via tensor search, and re-order them with GPT3 for better retrieval.
A new optional parameter is added to the search method:
response = mq.index("my-test-index").search(
    "What plants have great heat resilience?",
    searchable_attributes=["Description", "Title"],
    reranker="openai/GPT3/rerank",
    reranker_api_key="<...api key...>"
)
A reranker_api_key parameter is added as an optional parameter.
We create an elif branch checking for a GPT reranker in reranking.rerank_search_results().
Create a file, src/marqo/s2_inference/reranking/openai/gpt3.py, which will interface with the GPT3 API.
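As a rough sketch of that dispatch (the surrounding signature and exact branch condition are assumptions, not the existing code):

def rerank_search_results(search_results: dict, query: str, reranker: str,
                          reranker_api_key: str = None) -> dict:
    # Hypothetical dispatch: the module path mirrors the draft, other names are illustrative only.
    if reranker.lower().startswith("openai/gpt3"):
        from marqo.s2_inference.reranking.openai import gpt3
        # the reranker_api_key would also need to be threaded through to the GPT3 call
        return gpt3.rerank(search_results=search_results, query=query)
    # ... existing OWL-ViT and text cross-encoder reranking paths continue here ...
    return search_results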
Note that the format (k, v pair) of the reranker model in the cache can differ from the existing models. This can pave the way for a refactor of the cache. This is OK because reranker models and embedding models aren't used the same way in the code. We do need to be careful about the model and device info endpoints; tests will need to be created to ensure that they work as intended for the reranker models.
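For illustration only, the cached entry for the GPT3 reranker might hold remote API configuration rather than loaded weights; the key format and fields below are assumptions, not Marqo's actual cache schema:

# hypothetical cache contents -- key format and fields are assumptions
available_models = {
    # existing embedding model entry: keyed on model plus device, value is the loaded model object
    "hf/all_datasets_v4_MiniLM-L6||cpu": "<loaded SentenceTransformer>",
    # GPT3 reranker entry: no local weights, just remote API configuration
    "openai/GPT3/rerank": {"engine": "text-davinci-003", "type": "remote_reranker"},
}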
This will have an entrypoint function rerank(search_results: dict, query: str) -> dict
The prompt, also defined in this file, will have a structure like this:
f"""
Background:
{"\n".join["Source {i+1}) {content}" for i, content in enumerate(results)])}
Query: {question}
Instruction: rank the sources with respect to relevance of the query
"""
The results from GPT look like this:
1) Source 4: The 15 Best Heat Tolerant Plants (Because the Dog Days of Summer Are Here!)
2) Source 1: 20 Heat-Tolerant Plants That Will Survive (and Thrive) in the Summer
3) Source 3: Top 10 Heat-Tolerant Plants
4) Source 2: 7 Heat-Tolerant Plants that Love the Sun
5) Source 0: Plants That Like Full Sun and Heat
We parse the results to get the original indexes, which will be used to order the results.
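A rough sketch of that parsing step, assuming the "Source N" format shown above; the regex and reordering logic are illustrative rather than the final implementation:

import re

def order_hits_from_gpt_output(gpt_text: str, hits: list) -> list:
    """Reorder the original hits using the source numbers GPT returned (illustrative sketch only)."""
    # pull out the integer after each "Source", in the order GPT ranked them
    ranked = [int(m) for m in re.findall(r"Source\s+(\d+)", gpt_text)]
    # map those numbers back to hit positions (adjust here if the prompt numbers sources from 1)
    ordered = [hits[i] for i in ranked if 0 <= i < len(hits)]
    # keep any hits GPT did not mention, preserving their original order
    ordered += [h for j, h in enumerate(hits) if j not in set(ranked)]
    return ordered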
Errors from the GPT3 API raise a RerankerError and forward the message from GPT3.
Make rerank_search_results() operate on a copy of tensor_search.search()'s results.
In rerank_search_results(), for the OWL and text rerankers this copy is returned (as the copy is mutated by those rerankers); for GPT3, the output of the rerank function is returned directly.
Alternatively, a new optional parameter is added to the search method:
response = mq.index("my-test-index").search(
    "What plants have great heat resilience?",
    searchable_attributes=["Description", "Title"],
    reranker="openai/GPT3/rerank",
    reranker_properties={
        "api_key": "<... your api key here ...>"
    }
)
An optional dict called reranker_properties replaces the reranker_api_key parameter in the search endpoint body, tensor_search.search(), and reranking.rerank_search_results().
The benefit of this approach is that we don't have to keep adding new parameters to the search endpoint body, tensor_search.search(), and py-marqo.search() signatures to add extra reranking parameters. We would only need to handle the new parameters in reranking.rerank_search_results().
The API, however, becomes more nested.
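As a sketch of how reranker_properties could be consumed in one place (names and validation are assumptions, not the final signature):

def rerank_search_results(search_results: dict, query: str, reranker: str,
                          reranker_properties: dict = None) -> dict:
    # Illustrative only: new reranking options live in one dict instead of new top-level parameters.
    reranker_properties = reranker_properties or {}
    if reranker.lower().startswith("openai/gpt3"):
        api_key = reranker_properties.get("api_key")
        if api_key is None:
            # in practice this would likely raise Marqo's RerankerError
            raise ValueError("openai/GPT3/rerank requires reranker_properties['api_key']")
        # ... hand off to the GPT3 module along with any other properties (engine, prompt, ...) ...
    return search_results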
Can add reranker models to the cache in a different way to embedding models
Rather than re-ordering the existing documents, attach a new document (e.g. reranker_output) that the user can parse on their end (a sketch follows these ideas).
Create a mini DSL:
instruction = "summarise the following sources and cite them:
SOURCES:{}
SUMMARY:"
Move towards reranking on multiple fields
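A rough sketch of the reranker_output idea mentioned above; the field name and structure are assumptions:

def attach_reranker_output(search_results: dict, gpt_text: str) -> dict:
    """Keep the original ordering and attach the raw LLM output as an extra entry (illustrative only)."""
    augmented = dict(search_results)  # shallow copy so the original results are untouched
    augmented["reranker_output"] = {
        "model": "openai/GPT3/rerank",
        "raw_response": gpt_text,  # e.g. a ranking, or a summary with citations
    }
    return augmented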
Is your feature request related to a problem? Please describe.
Have the ability to use an LLM as a reranker in Marqo, allowing for augmented retrieval. Currently this is not supported. This would allow using the LLM as a reranker ("rank these based on X...") or for summarisation that cites the original work.
Describe the solution you'd like
When searching, have the option to add an LLM as a reranker. Initially it should support hosted endpoints for the LLMs.
client.index(index_name).search(
    "why is the sky blue",
    reranker={
        "name": "GPT3",
        "api_key": "###",
        "engine": "davinci-003",
        "prompt": "Given the query and the context, summarise and cite the references. QUERY:{}"
    }
)
It could allow for a free-text prompt, or some prompts could be curated for predefined tasks, e.g. task="summarise".
Describe alternatives you've considered
Additional context
An example is here: https://github.com/marqo-ai/marqo/tree/mainline/examples/GPT3NewsSummary. This feature would bring that functionality within Marqo.