neuml / txtai

💡 All-in-one open-source embeddings database for semantic search, LLM orchestration and language model workflows
https://neuml.github.io/txtai
Apache License 2.0

End-to-end example for search with cross-encoder using API #452

Closed dominikstein-ibm closed 1 year ago

dominikstein-ibm commented 1 year ago

Hi all,

I'm implementing a semantic search demo with txtai and want to make use of both a vector index with embeddings (msmarco-distilbert-base-v4) and a cross-encoder/re-ranker based on Sentence-BERT (cross-encoder/ms-marco-MiniLM-L-12-v2) to get the best possible results. I am unsure whether I'm doing everything the right way. Please confirm.

I am using the API and building my own Docker container. In my config.yml I have:

embeddings: 
  path: sentence-transformers/msmarco-distilbert-base-v4
  content: True 
  objects: True 

and

similarity: 
  path: cross-encoder/ms-marco-MiniLM-L-12-v2

I am using the API endpoint

`http://localhost:8000/search/`

In order to re-rank the results from the embeddings pipeline, do I need to create and call a workflow

crossrankedsearch:
        tasks:
            - action: embeddings
            - action: similarity

or does the /search endpoint cover that part automatically? If not, is my workflow correct?

Thanks in advance!

davidmezzetti commented 1 year ago

I've had really good results as of late with the E5 series of models - https://huggingface.co/intfloat/e5-base. Better than re-rankers/cross-encoders. Might be worth a try for your demo.

Nonetheless, if you'd like to have a consolidated search + rerank workflow action, it can be done with the following.

Create this file as ranker.py in the Docker image so it is available on the PYTHONPATH (the workflow below references it as ranker.Ranker).

class Ranker:
    def __init__(self, application):
        self.application = application
        self.crossencoder = self.application.pipelines["crossencoder"]

    def __call__(self, queries, limit=10):
        # Embeddings search
        results = self.application.batchsearch(queries, limit)

        # Iterate over each result set and rerank
        output = []
        for i, result in enumerate(results):
            # Re-rank results
            rerank = []
            for uid, score in self.crossencoder(queries[i], (x["text"] for x in result)):
                row = result[uid]
                rerank.append({"id": row["id"], "text": row["text"], "score": score})

            # Store re-rankings
            output.append(rerank)

        return output

Then create the workflow file:

embeddings: 
  path: sentence-transformers/msmarco-distilbert-base-v4
  content: True 
  objects: True 

crossencoder: 
  path: cross-encoder/ms-marco-MiniLM-L-12-v2

ranker.Ranker:
  application: 

workflow:
  crossrankedsearch:
    tasks:
      - action: ranker.Ranker
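
With that in place, the workflow can be called through the API. Here is a rough, untested sketch assuming the service is running on localhost:8000 as in your setup (the query text is just a placeholder):

import requests

# Run the crossrankedsearch workflow defined above; each element is passed to Ranker as a query
response = requests.post(
    "http://localhost:8000/workflow",
    json={"name": "crossrankedsearch", "elements": ["example query text"]},
)

# One list of re-ranked {"id", "text", "score"} rows per input query
print(response.json())
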
dominikstein-ibm commented 1 year ago

Hi David, thanks for the response! Now I'm interested in the e5 model. From the documentation, to use it one needs to prepend "query:" and "passage:" at inference time. Did you use the e5 model with txtai? If so, how did you configure txtai to use it?

davidmezzetti commented 1 year ago

The e5 model is used in the txtai-wikipedia index - https://huggingface.co/NeuML/txtai-wikipedia/blob/main/config.json

You can add an instructions section to the configuration.

  "instructions": {
    "query": "query: ",
    "data": "passage: "
  }

More info can be found here - https://neuml.github.io/txtai/embeddings/configuration/#instructions
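
For reference, here is a minimal standalone sketch of that configuration using the Python API (the model id, document and query text are just placeholders; the same keys go under the embeddings section of a config.yml when running the API):

from txtai.embeddings import Embeddings

# e5 expects "query: " / "passage: " prefixes, which txtai prepends via the instructions option
embeddings = Embeddings({
    "path": "intfloat/e5-base",
    "content": True,
    "instructions": {
        "query": "query: ",
        "data": "passage: "
    }
})

# Index a sample passage and run a query; both get the appropriate prefix automatically
embeddings.index([(0, "passage text to index", None)])
print(embeddings.search("example query", 1))
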

davidmezzetti commented 1 year ago

Closing due to inactivity. Re-open or open a new issue if there are further questions.