Closed semoal closed 4 months ago
Looks like the inference crashes (e.g. on a CTRL-C event) - how do you run the model?
From a clean installation on your macOS, what are the precise steps to reproduce this? Are you using the Docker container?
I assume you are starting/stopping the async event loop incorrectly? Are you using astart / astop similar to the server in server.py?
Yes, stopping the event loop while it was generating caused it to hang until the terminal was force-closed. Fixed by adding a try/except to the lifespan context manager:
```python
@asynccontextmanager
async def lifespan(app: FastAPI):
    instrumentator.expose(app)
    # Load the ML model
    await models.astart()
    logger.info(docs.startup_message(host="localhost", port="8080", prefix=""))
    try:
        yield
    except asyncio.exceptions.CancelledError:
        pass
    # Clean up the ML models and release the resources
    await models.ateardown()
```
Awesome
System Info
macOS; it happens with both torch and optimum, with small or large batch sizes. Model:
jinaai/jina-embeddings-v2-base-es
Reproduction
I can privately share a repro project where this is easy to reproduce.
Expected behavior
It should not crash.