AmirLavasani opened 9 months ago
Hello @AmirLavasani jan, thanks for this awesome feature request/ideation.
We're currently working on adding Hezar to the Hugging Face Hub Inference API so that users can easily interact with our models through HTTP requests, which is free for us. It also enables our models to be inferenced directly on the Hub. You can find out more in #56.
The solution you proposed is also desirable. The only issue is that we'd need access to servers to host our services.
One idea we had a while ago was to provide plug-and-play Docker images for different tasks that serve the same purpose you proposed. The reason is that some models are not usable in real-world scenarios out of the box. For example, our OCR models only work on already-detected text box images, so you cannot use them on an image that contains multiple text segments. The solution is to chain two models in a pipeline to extract all the text correctly (a CTPN + CRNN pipeline, for instance), hence the need for a separate end-to-end service. Let me know what you think <3 .
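The two-stage pipeline described above (detector feeding a recognizer) could be sketched roughly like this. The interfaces here are illustrative stand-ins, not Hezar's actual model classes:

```python
# Sketch of a two-stage OCR pipeline (hypothetical interfaces; Hezar's actual
# model API may differ). Stage 1 detects text boxes (CTPN-like), stage 2
# recognizes the text inside each box (CRNN-like).
from dataclasses import dataclass
from typing import Callable, List, Tuple

Box = Tuple[int, int, int, int]  # (x1, y1, x2, y2)


@dataclass
class OCRPipeline:
    detect_boxes: Callable[[object], List[Box]]    # CTPN-like detector
    recognize_text: Callable[[object, Box], str]   # CRNN-like recognizer

    def __call__(self, image) -> List[str]:
        # Run detection once, then recognition on each detected box
        return [self.recognize_text(image, box) for box in self.detect_boxes(image)]


# Usage with stand-in functions (real ones would be loaded Hezar models)
pipeline = OCRPipeline(
    detect_boxes=lambda img: [(0, 0, 10, 10), (0, 20, 10, 30)],
    recognize_text=lambda img, box: f"text@{box[1]}",
)
print(pipeline("fake-image"))  # one recognized string per detected box
```

An end-to-end service would then only need to expose this single callable, hiding the two underlying models from the user.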
Hi @arxyzan jan, Thanks. Let's break down the points.
Thanks once more, and great job on this project!
Thanks for the follow up @AmirLavasani jan.
Now I see. I think we need more discussion in this regard. My idea for now is that we should add an `api` module that can handle serving any model and making it ready for inference. We've all had such experience serving ML models, and I don't think it would be hard to do. The important part is that, as you mentioned, we need to make some design decisions for this section.
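A serving layer like the one described could be sketched framework-agnostically as a task-to-model registry; an HTTP layer (e.g. FastAPI routes) would then sit on top of it. All names here are illustrative assumptions, not Hezar's actual API:

```python
# Sketch of a generic serving layer: a registry mapping a task name to a
# loaded model's predict function. A web framework would expose each entry
# as a route. Names are hypothetical, not Hezar's real API.
from typing import Any, Callable, Dict, List


class ModelServer:
    def __init__(self):
        self._models: Dict[str, Callable[[List[Any]], List[Any]]] = {}

    def register(self, task: str, predict_fn: Callable[[List[Any]], List[Any]]):
        """Register a predict function (batch of inputs -> list of results)."""
        self._models[task] = predict_fn

    def predict(self, task: str, inputs: List[Any]) -> List[Any]:
        if task not in self._models:
            raise KeyError(f"no model registered for task '{task}'")
        return self._models[task](inputs)


server = ModelServer()
# Stand-in for something like Model.load(...).predict in Hezar
server.register("text-classification", lambda texts: [{"label": "demo"} for _ in texts])
print(server.predict("text-classification", ["سلام"]))
```

Keeping the registry separate from the web framework means the same serving logic could back FastAPI routes, a CLI, or a batch job.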
Some additional notes on your points:
`hezar-apps` is more convenient. We can discuss this later. Unfortunately, I won't be available until next weekend. For now you can message me on Telegram (@arxyzan) or email, and we'll set up a meeting for later.
Thanks for your contribution Amir jan.
You're welcome @arxyzan jan! 😊
I completely agree; adding an `api` module, following FastAPI structure guidelines, should be a great starting point.
Regarding the HF API, I'll take a look at HF Inference API-compatible outputs.
The concept of "hezar-apps" is intriguing. We can certainly delve into this idea further.
Next weekend works for me, we'll schedule a meeting at your convenience.
@AmirLavasani Nice. Hit me up and we'll talk.😉
I don't know the status of this implementation, but I have created a repo that contains the overall structure for wrapping Hezar in FastAPI: https://github.com/rezashabrang/hezar-api
Currently it only supports the NLP domain and models that require only text inputs.
I wanted to publish the Docker image on the GitHub Container Registry, but there are limitations due to the huge size of the Hezar package. I'll look into how we can reduce the overall image size in the future (maybe using domain-specific extras like `hezar[nlp]`).
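One possible way to shrink the image, assuming Hezar defines domain extras like `hezar[nlp]` (that extra and the `app.py` layout here are assumptions, not the actual repo structure), is to install only the needed domain on a slim base image:

```dockerfile
# Hypothetical slimmer image: install only the NLP extra on a slim base.
# Assumes a `hezar[nlp]` extra exists and an app.py exposing a FastAPI `app`.
FROM python:3.10-slim
WORKDIR /app
# Only the domain-specific dependencies instead of the full package
RUN pip install --no-cache-dir "hezar[nlp]" fastapi uvicorn
COPY . .
CMD ["uvicorn", "app:app", "--host", "0.0.0.0", "--port", "8000"]
```

`--no-cache-dir` also keeps pip's download cache out of the final layer.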
This is just the starting point; we need to add other routers depending on the domain, e.g. for computer vision we need a router that accepts images for generating captions, and for the speech recognition domain one that accepts audio files.
Hi @rezashabrang, thanks for the effort man. I'm sorry to see this so late, my GH notifs were bugged for a while and I missed some of the issues and mentions. I just saw the repo and I think it's pretty solid. If you're still willing to help us out on this task just let me know.
Hey @arxyzan! No problem and I'm still on board for this implementation. You can also state your ideas on how to integrate this into hezar (e.g: separate repo) or any opinion for the API itself and I'd be happy to contribute.
@rezashabrang So glad to have you with us.
As you might have noticed, Hezar model prediction follows the same pipeline for any task, meaning that all models (independent of the task) take an input or a batch of inputs and output a list of results (whose format depends on the task). So one main challenge is to implement the service in a way that keeps the same flexibility. I know this might add some overhead for now, so my best solution is to implement one POST route for each of the tasks (6-7 tasks for models, plus word embeddings; we can add preprocessors later too) so that each one has its own input/output request schema.
Regarding the second question, I think we can easily add an `api` or `serve` module in the root module (`hezar.serve`, for example) and put everything there.
Unfortunately, I'm not familiar with FastAPI design patterns and best practices so I'd be glad to leave this to you.
Let me know what you think.
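The one-route-per-task idea above could be sketched with explicit request/response schemas. With FastAPI these would be Pydantic models; plain dataclasses convey the same shape (all names here are illustrative, not a fixed design):

```python
# Sketch of per-task request/response schemas (hypothetical names). Each task
# gets its own POST route, e.g. /text-classification or /ocr, so the shared
# "inputs in, list of results out" contract is preserved per schema.
from dataclasses import dataclass
from typing import List


@dataclass
class TextClassificationRequest:
    inputs: List[str]        # a single text or a batch of texts


@dataclass
class TextClassificationResult:
    label: str
    score: float


@dataclass
class OCRRequest:
    images: List[str]        # e.g. base64-encoded images


@dataclass
class OCRResult:
    text: str


# A batch request maps to one result object per input
req = TextClassificationRequest(inputs=["متن نمونه"])
res = [TextClassificationResult(label="positive", score=0.98) for _ in req.inputs]
print([r.label for r in res])
```

Separate schemas per task keep the OpenAPI docs precise while the underlying predict call stays uniform.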
Issue Description: Let's enhance Hezar by adding a RESTful API. This will allow users to easily access and utilize Hezar's AI models via HTTP endpoints.
Objective: The goal of this enhancement is to provide a user-friendly and standardized way for the community to interact with Hezar's AI models. By exposing these models through a RESTful API, users can leverage AI capabilities with ease, making them accessible to a broader audience.
Proposed Implementation: I suggest implementing a RESTful API (following the OpenAPI Specification) using FastAPI. This API will expose endpoints for the various AI models and functions offered by Hezar.
If the community finds this feature useful, I'm willing to work on its implementation.