Closed ffreemt closed 3 months ago
Love the idea! Not sure how well you can expose a REST API on Hugging Face Spaces. I would follow this guide — effectively you need to use Gradio and not FastAPI (my guess): https://www.tomsoderlund.com/ai/building-ai-powered-rest-api
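A minimal sketch of what the Gradio route might look like. The wiring and names here are assumptions, not the project's API: the `embed` function is a placeholder, and on a real Space it would call the engine shown below (or any embedding model). The relevant point is that Gradio auto-generates a simple HTTP API for any `Interface`, which is what makes "REST via Gradio" feasible on Spaces.

```python
# Hypothetical sketch: serving an embedding function through Gradio on a Space.
def embed(text: str) -> list[float]:
    # Placeholder embedding: real code would call a model, e.g. infinity's
    # AsyncEmbeddingEngine or a sentence-transformers model.
    return [float(len(text))]

def build_app():
    import gradio as gr  # only needed on the Space itself
    # Gradio exposes an HTTP API for the Interface automatically,
    # alongside the web UI.
    return gr.Interface(fn=embed, inputs="text", outputs="json")

# On the Space, the entrypoint would be: build_app().launch()
```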
I would default to the Python API (example below), then add a REST API later:

```python
import asyncio

from infinity_emb import AsyncEmbeddingEngine, EngineArgs

engine = AsyncEmbeddingEngine.from_args(
    EngineArgs(model_name_or_path="BAAI/bge-small-en-v1.5", engine="torch")
)

async def main(sentences=("Embed this is sentence via Infinity.", "Paris is in France.")):
    async with engine:  # engine starts with engine.astart()
        embeddings, usage = await engine.embed(sentences=sentences)
    # engine stops with engine.astop()

# call the function from any async func or from asyncio.run()
asyncio.run(main())
```
Hi. Thanks for the wonderful project.

Is it possible to directly deploy `infinity` on a hf space? I guess it's possible to do it via `gradio`, but all I need is just embeddings. So I wonder whether I can simply run something like

```shell
infinity_emb --model-name-or-path sentence-transformers/paraphrase-multilingual-MiniLM-L12-v2 --port 7860
```

in a hf space and access the API. I tried to deploy `infinity` on a hf space: https://huggingface.co/spaces/mikeee/emb384. It seems to be running, but I cannot figure out how to make a request to the API. There isn't anything at https://huggingface.co/spaces/mikeee/emb384/docs or https://huggingface.co/spaces/mikeee/emb384:7860/docs.
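As for running the CLI directly: on a Docker-type Space, something like the Dockerfile below might work. This is an untested sketch — the `[all]` pip extra and base image are assumptions; the only Space-specific fact used is that Docker Spaces route traffic to port 7860 by default (configurable via `app_port` in the Space README).

```dockerfile
# Untested sketch of a Docker-type Space running infinity directly.
FROM python:3.10-slim
# The "[all]" extra is an assumption; plain `pip install infinity-emb` may suffice.
RUN pip install --no-cache-dir "infinity-emb[all]"
# Docker Spaces expose port 7860 by default (configurable via app_port).
EXPOSE 7860
CMD ["infinity_emb", "--model-name-or-path", "sentence-transformers/paraphrase-multilingual-MiniLM-L12-v2", "--port", "7860"]
```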
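On the request side, a guess at how a call could look once the server is reachable. The base URL is an assumption: Spaces usually serve the app at `https://<user>-<space>.hf.space`, not under `huggingface.co/spaces/…`, which may be why `/docs` showed nothing there. The `/embeddings` route and payload shape assume infinity's OpenAI-compatible API; the request is built but not sent here.

```python
# Sketch: building a request to an infinity server's /embeddings endpoint.
import json
import urllib.request

# Assumed direct app URL for the Space (not the huggingface.co/spaces page).
BASE_URL = "https://mikeee-emb384.hf.space"

def build_request(sentences, model="sentence-transformers/paraphrase-multilingual-MiniLM-L12-v2"):
    """Build an OpenAI-style embeddings POST request (not sent here)."""
    payload = json.dumps({"model": model, "input": list(sentences)}).encode()
    return urllib.request.Request(
        f"{BASE_URL}/embeddings",
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )

req = build_request(["Embed this sentence via Infinity."])
# To actually send it (requires the Space to be up):
#     with urllib.request.urlopen(req) as resp:
#         print(json.load(resp))
```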