runpod-workers / worker-infinity-embedding

Rerankers - How to use? #9

Closed: axeloh closed this issue 2 weeks ago

axeloh commented 2 weeks ago

Hello,

I have deployed a reranker model on RunPod serverless using your worker. Your docs say that rerankers must be used with standard usage, meaning plain HTTP requests. I tried the following:

import asyncio

import httpx

data = {
    "model": 'BAAI/bge-reranker-large',
    "query": 'What is a Panda',
    "docs": [
        'Donald Trump is a former US president.',
        'The giant panda (Ailuropoda melanoleuca), sometimes called a panda bear or simply panda, is a bear species endemic to China.',
    ],
}

# Request body as originally sent (this is the format that produces the error)
body = {"input": {"method_name": "rerank", "input": data}}
headers = {"Authorization":  ...}

async def main():
    # `url` is the base URL of the serverless endpoint (defined elsewhere)
    async with httpx.AsyncClient(timeout=60) as client:
        response = await client.post(url=f'{url}/runsync', json=body, headers=headers)
        response.raise_for_status()

asyncio.run(main())

But this gives me the error message: {'code': 400, 'message': "the loaded moded cannot fullyfill "embed".options are {'rerank'}."

Do you know what is wrong? Could you perhaps provide an example which showcases the format?

Thanks!

axeloh commented 2 weeks ago

So I figured it out. To answer my own question: the issue was the format of the body. It should just be body = {"input": data}, with no "method_name" wrapper around the data. The previous body format was from another RunPod instance I had used in the past.
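
For anyone landing here later, the working format can be sketched as below. This is only a minimal sketch, assuming the endpoint accepts "model", "query", and "docs" directly under "input" as described in this thread. The helper names build_rerank_body and rerank are hypothetical (they are not part of the worker), and url and headers must be supplied by the caller, since their values are not shown in this thread. The shape of the returned JSON is not documented here, so it is returned as-is.

```python
import asyncio
from typing import Any


def build_rerank_body(model: str, query: str, docs: list[str]) -> dict[str, Any]:
    """Build the request body in the format that worked in this thread.

    Hypothetical helper: the rerank fields sit directly under "input",
    with no extra {"method_name": ..., "input": ...} wrapper.
    """
    return {"input": {"model": model, "query": query, "docs": docs}}


async def rerank(url: str, headers: dict[str, str], body: dict[str, Any]) -> Any:
    """POST the body to `{url}/runsync` and return the decoded JSON response.

    Hypothetical helper: `url` and `headers` come from the caller because
    the thread does not show their values.
    """
    # Imported here so build_rerank_body can be used without httpx installed.
    import httpx

    async with httpx.AsyncClient(timeout=60) as client:
        response = await client.post(url=f"{url}/runsync", json=body, headers=headers)
        response.raise_for_status()
        return response.json()


body = build_rerank_body(
    model="BAAI/bge-reranker-large",
    query="What is a Panda",
    docs=[
        "Donald Trump is a former US president.",
        "The giant panda (Ailuropoda melanoleuca), sometimes called a panda bear "
        "or simply panda, is a bear species endemic to China.",
    ],
)
```

To send it, call asyncio.run(rerank(url, headers, body)) with your own endpoint URL and authorization header. Keeping the body construction separate from the HTTP call makes it easy to check the payload shape before spending a request on it.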