michaelfeil / infinity

Infinity is a high-throughput, low-latency REST API for serving vector embeddings, supporting a wide range of text-embedding models and frameworks.
https://michaelfeil.eu/infinity/
MIT License

API Key Authentication for Michaelfeil Infinity #207

Closed: AjayKarma05 closed this issue 1 month ago

AjayKarma05 commented 2 months ago

Model description

Could you please provide guidance on how to enable API_KEY authentication? Alternatively, is there a plan to implement API_KEY authentication similar to OpenAI's approach?

Open source status

Provide useful links for the implementation

No response

semoal commented 1 month ago

It's just a FastAPI server, so you can implement it the same way you would for any other FastAPI app.
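For readers unfamiliar with the usual FastAPI pattern, here is a minimal sketch of a Bearer-token check wired in as a route dependency. It is not Infinity's own code: the names bearer_token_matches, build_app, require_api_key and the /protected route are hypothetical, chosen only for illustration.

```python
import hmac
from typing import Optional


def bearer_token_matches(authorization: Optional[str], expected_key: str) -> bool:
    """Return True if an Authorization header value carries the expected Bearer token."""
    if not authorization:
        return False
    scheme, _, token = authorization.partition(" ")
    if scheme.lower() != "bearer":
        return False
    # Constant-time comparison avoids leaking key contents through timing.
    return hmac.compare_digest(token.strip().encode(), expected_key.encode())


def build_app(expected_key: str):
    """Build a tiny FastAPI app whose only route requires the Bearer token."""
    # Imported here so the helper above stays usable without FastAPI installed.
    from fastapi import Depends, FastAPI, Header, HTTPException

    def require_api_key(authorization: Optional[str] = Header(default=None)) -> None:
        # FastAPI maps this parameter to the incoming "Authorization" header.
        if not bearer_token_matches(authorization, expected_key):
            raise HTTPException(status_code=401, detail="Invalid or missing API key")

    app = FastAPI()

    # Hypothetical route; the dependency runs before the handler on every request.
    @app.get("/protected", dependencies=[Depends(require_api_key)])
    def protected() -> dict:
        return {"status": "ok"}

    return app
```

The app returned by build_app("mykey123") can be served with any ASGI server such as uvicorn. FastAPI also ships fastapi.security.HTTPBearer, which parses the header for you and could replace the manual parsing here.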

michaelfeil commented 1 month ago

I added an example of how to use Infinity with runpod.io.

Note that API keys are typically not set by hand; they are generated. They are also usually not verified by the application itself, but by a separate service such as AWS API Gateway. I understand your interest, but I think such a feature is easy to implement yourself, or is better added via another service.

michaelfeil commented 1 month ago

https://github.com/monotykamary/infinity/commit/36680ae5435eac2e7db2ec70963ee82121943f10

@monotykamary has implemented it. I changed my mind: if the contribution is simple enough, comes with a unit test that covers every line of code, and defaults to no API key, it would be accepted.

monotykamary commented 1 month ago

Oh, I just did a dirty implementation to quickly spin up an embedding server with auth on Modal: https://github.com/dwarvesf/llm-hosting/blob/main/infinity_snowflake_arctic_embed_l_335m.py

michaelfeil commented 1 month ago

Doesn't look dirty at all, @monotykamary! Awesome.

michaelfeil commented 1 month ago

@monotykamary @semoal @AjayKarma05 Added auth. You can set a Bearer token via INFINITY_API_KEY=mykey123 or --api-key mykey123
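On the client side, a Bearer token is sent in the Authorization header of each request. The sketch below only builds such a request with the standard library. It assumes the server listens on http://localhost:7997 and exposes an OpenAI-style /embeddings route; the base URL, the route, the payload shape, and the model name "your-model-id" are all assumptions to adjust for your deployment, and build_embeddings_request is a hypothetical helper.

```python
import json
import urllib.request
from typing import Sequence


def build_embeddings_request(
    base_url: str, api_key: str, model: str, texts: Sequence[str]
) -> urllib.request.Request:
    """Build a POST request that carries the API key as a Bearer token."""
    # Assumed OpenAI-style payload: a model id plus a list of input strings.
    payload = json.dumps({"model": model, "input": list(texts)}).encode("utf-8")
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/embeddings",  # assumed route
        data=payload,
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


# Assumed local deployment started with --api-key mykey123.
request = build_embeddings_request(
    "http://localhost:7997", "mykey123", "your-model-id", ["hello world"]
)
# To actually send it against a running server:
#   with urllib.request.urlopen(request) as response:
#       body = json.load(response)
```

Since the token is a standard Bearer header, clients that let you set an API key and a base URL, such as the OpenAI Python client, should be able to send it the same way.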

Jimmy-Newtron commented 2 weeks ago

How do I set the API_KEY in the LangChain Infinity embeddings integration?

michaelfeil commented 2 weeks ago

@Jimmy-Newtron PRs to LangChain are welcome. I think this feature is missing there.