michaelfeil / infinity

Infinity is a high-throughput, low-latency REST API for serving text-embedding, reranking, and CLIP models
https://michaelfeil.github.io/infinity/
MIT License

BAAI/bge-reranker-v2-gemma #333

Open Jimmy-Newtron opened 1 month ago

Jimmy-Newtron commented 1 month ago

Model description

The model loads, but strangely it reports only the embed capability and not rerank:

{
  "id": "BAAI/bge-reranker-v2-gemma",
  "stats": {
    "queue_fraction": 0,
    "queue_absolute": 0,
    "results_pending": 0,
    "batch_size": 32
  },
  "object": "model",
  "owned_by": "infinity",
  "created": 1723476182,
  "backend": "torch",
  "capabilities": [
    "embed"
  ]
}
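A minimal sketch of how the mismatch above could be checked programmatically. It only parses the model entry quoted in this issue; how that entry was fetched is not stated here, and the helper name `missing_capabilities` is hypothetical, not part of infinity.

```python
import json

# Model entry copied from the issue text above (trimmed to the relevant fields).
MODEL_ENTRY = json.loads("""
{
  "id": "BAAI/bge-reranker-v2-gemma",
  "object": "model",
  "owned_by": "infinity",
  "backend": "torch",
  "capabilities": ["embed"]
}
""")


def missing_capabilities(entry: dict, required: set) -> set:
    """Return the required capabilities that the model entry does not report.

    Hypothetical helper for illustration; not an infinity API.
    """
    return set(required) - set(entry.get("capabilities", []))


missing = missing_capabilities(MODEL_ENTRY, {"rerank"})
if missing:
    print(f"{MODEL_ENTRY['id']} is missing capabilities: {sorted(missing)}")
```

A check like this could run right after the server starts, so a reranker that was loaded as an embedder is caught before any request is sent.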

Open source status

Provide useful links for the implementation

No response

greptile-apps[bot] commented 1 month ago

To resolve the issue of BAAI/bge-reranker-v2-gemma reporting the embed capability but not rerank, follow these steps:

  1. Locate Model Definition: Open models.py and find the class definition for BgeRerankerV2Gemma.

  2. Add Rerank Capability: Ensure the class includes a method for reranking. For example:

class BgeRerankerV2Gemma(BaseModel):
    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)
        # existing initialization code

    def rerank(self, inputs):
        # Add reranking logic here
        pass

  3. Update Capabilities: In the same file, update the capabilities list to include rerank:

self.capabilities = ["embed", "rerank"]

  4. Ensure Backend Support: Verify that the specified backend (torch) supports the reranking operation. This might involve checking or updating backend.py.

  5. Configuration: Ensure the model configuration file (likely config.json or similar) reflects the new capabilities:

{
  "capabilities": ["embed", "rerank"]
}

These steps should add the rerank capability to the model.
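As a rough illustration of the "Add Rerank Capability" and "Update Capabilities" steps above, the sketch below derives the capability list from the methods a class actually implements, so the list cannot drift out of sync with the code. All class names and the toy scoring logic are hypothetical stand-ins; none of this is taken from the infinity code base.

```python
from typing import List


class HypotheticalBaseModel:
    """Stand-in for a model wrapper; NOT a class from infinity."""

    # Capability names that map one-to-one onto method names.
    KNOWN_CAPABILITIES = ("embed", "rerank")

    def capabilities(self) -> List[str]:
        # Report a capability only if the matching method exists on the instance.
        return [
            name
            for name in self.KNOWN_CAPABILITIES
            if callable(getattr(self, name, None))
        ]


class HypotheticalEmbedder(HypotheticalBaseModel):
    def embed(self, sentences: List[str]) -> List[List[float]]:
        # Toy embedding: a single feature holding the sentence length.
        return [[float(len(s))] for s in sentences]


class HypotheticalReranker(HypotheticalEmbedder):
    def rerank(self, query: str, docs: List[str]) -> List[float]:
        # Toy relevance score: number of lowercase words shared with the query.
        query_words = set(query.lower().split())
        return [float(len(query_words & set(d.lower().split()))) for d in docs]


print(HypotheticalEmbedder().capabilities())
print(HypotheticalReranker().capabilities())
```

With this design, adding a `rerank` method is enough for the capability to appear; there is no separate list to update by hand.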


About Greptile: This response provides a starting point for your research, not a precise solution.