hezarai / hezar

The all-in-one AI library for Persian, supporting a wide variety of tasks and modalities!
https://hezarai.github.io/hezar/
Apache License 2.0

Add RESTful API Using FastAPI #102

Open AmirLavasani opened 9 months ago

AmirLavasani commented 9 months ago

Issue Description: Let's enhance Hezar by adding a RESTful API. This will allow users to easily access and utilize Hezar's AI models via HTTP endpoints.

Objective: The goal of this enhancement is to provide a user-friendly and standardized way for the community to interact with Hezar's AI models. By exposing these models through a RESTful API, users can leverage AI capabilities with ease, making them accessible to a broader audience.

Proposed Implementation: I suggest implementing a RESTful API (OpenAPI Specification) using FastAPI. This API will expose endpoints for various AI models and functions offered by Hezar.

If the community finds this feature useful, I'm willing to work on its implementation.

arxyzan commented 8 months ago

Hello @AmirLavasani jan, thanks for this awesome feature request/ideation.

We're currently working on adding Hezar to the Hugging Face Hub Inference API so that users can easily interact with our models through HTTP requests, which comes free for us. It also enables our models to be inferenced directly on the Hub. You can find out more in #56.

The solution you proposed is also desirable. The only issue here might be that we'll need access to servers to host our services.

One idea we had a while ago was to provide some plug-and-play Docker images for different tasks that serve the same purpose you proposed. The reason is that some models are not suitable for real-life use out of the box. For example, our OCR models only work on already-detected text box images; you cannot use them on an image that has multiple text segments inside. The solution is to use two models in a pipeline to extract all the texts correctly (a CTPN + CRNN pipeline, for instance), hence the need for a separate end-to-end service. Let me know what you think <3 .
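The two-stage idea (a detector feeding a recognizer) can be expressed as a generic pipeline of callables, independent of the concrete CTPN/CRNN models. The stage functions below are hypothetical stand-ins, just to show the data flow:

```python
# Generic two-stage OCR pipeline sketch: a detector yields text-box crops,
# a recognizer reads each crop. The stage functions are hypothetical
# stand-ins for CTPN-style detection and CRNN-style recognition models.
from typing import Callable, List


def ocr_pipeline(detect: Callable, recognize: Callable, image) -> List[str]:
    boxes = detect(image)                  # detector: image -> list of crops
    return [recognize(box) for box in boxes]  # recognizer: crop -> text


# Toy stand-ins to demonstrate the flow on a string "image":
fake_detect = lambda img: [img[:3], img[3:]]  # pretend to find two boxes
fake_recognize = lambda crop: crop.upper()    # pretend to read the text

print(ocr_pipeline(fake_detect, fake_recognize, "abcdef"))  # -> ['ABC', 'DEF']
```

An end-to-end service would wrap such a pipeline behind a single endpoint, so the user never deals with the intermediate boxes.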

AmirLavasani commented 8 months ago

Hi @arxyzan jan, thanks! Let's break down the points.

  1. I'd like to contribute to adding Hezar to the Hugging Face Hub Inference API. It's a very valuable feature.
  2. To clarify, I wasn't suggesting hosting Hezar services online. I meant creating a local API interface with FastAPI for easier model interaction. Think of it as a web UI, similar to the CLI discussed in issue #100. Here is an example of the FastAPI Docs interactive UI. Using this web UI, users can run inference via HTTP requests; the models run locally.
  3. I think the concept of an end-to-end solution is very powerful. It could serve as an abstract layer for utilizing low-level models in pipelines, such as OCR as you mentioned, or a voice chat with Whisper and LLMs, or YouTube translation using ASR and translation models. Nevertheless, it requires initial design decisions on how to structure these pipelines.
  4. I'll reach out to set up an online meeting to explore this further.

Thanks once more, and great job on this project!

arxyzan commented 8 months ago

Thanks for the follow-up @AmirLavasani jan. Now I see. I think we need more discussion in this regard. My idea for now is that we should add an api module that can handle serving any model and make it ready for inference. We've all had this kind of experience with ML models, and I don't think it would be hard to do. The important part, as you mentioned, is that we need some design decisions for this section.

Some additional notes on your points:

  1. Adding Hezar to the HF Hub Inference API has a minor blocker which we haven't solved yet. It's not that complicated, but we don't want to just patch it: we need a nice and easy schema for converting Hezar model outputs to HF Inference API-compatible outputs. You can learn more about this challenge in issue #56.
  2. For end-to-end solutions I think having another repo like hezar-apps is more convenient. We can discuss this later.
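One way to frame that conversion challenge is as a small per-task adapter that maps a Hezar-style output into the flat payload shape the HF Inference API widgets expect. The field names below are assumptions for illustration; see #56 for the real constraints:

```python
# Hypothetical adapter mapping a Hezar-style text-classification output
# to an HF Inference API-style payload. Field names are assumptions,
# not the actual schemas of either library.
def to_hf_text_classification(hezar_output):
    # Assume each Hezar result item looks like {"label": str, "score": float};
    # HF widgets expect a list (one per input) of lists of label/score dicts.
    return [[{"label": o["label"], "score": round(float(o["score"]), 4)}
             for o in hezar_output]]


print(to_hf_text_classification([{"label": "positive", "score": 0.987654}]))
# -> [[{'label': 'positive', 'score': 0.9877}]]
```

A registry of such adapters, one per task, would keep the conversion logic out of the models themselves.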

Unfortunately, I won't be available until next weekend. For now you can message me on Telegram (@arxyzan) or email and we'll set a meeting for later.

Thanks for your contribution Amir jan.

AmirLavasani commented 8 months ago

You're welcome @arxyzan jan! 😊

I completely agree; adding an api module, following FastAPI structure guidelines, should be a great starting point.

Regarding the HF API, I'll take a look at HF Inference API-compatible outputs.

The concept of "hezar-apps" is intriguing. We can certainly delve into this idea further.

Next weekend works for me; we'll schedule a meeting at your convenience.

arxyzan commented 8 months ago

@AmirLavasani Nice. Hit me up and we'll talk.😉

rezashabrang commented 5 months ago

I don't know the status of this implementation, but I have created a repo that contains the overall structure for wrapping Hezar in FastAPI: https://github.com/rezashabrang/hezar-api

Currently it only supports the NLP domain and models that require only text input.

I wanted to publish the Docker image on the GitHub Container Registry, but there are limitations due to the insane size of the Hezar packages. I'll look into how we can reduce the overall image size in the future (maybe by using specific extras like hezar[nlp]).

This is just the starting point; we need to add other routers depending on the domain, e.g. for computer vision we need to accept images for generating captions, and for the speech recognition domain we need to accept sound files.

arxyzan commented 2 months ago

Hi @rezashabrang, thanks for the effort man. I'm sorry to see this so late, my GH notifs were bugged for a while and I missed some of the issues and mentions. I just saw the repo and I think it's pretty solid. If you're still willing to help us out on this task just let me know.

rezashabrang commented 2 months ago

Hey @arxyzan! No problem, and I'm still on board for this implementation. You can also share your ideas on how to integrate this into Hezar (e.g. a separate repo) or any opinions on the API itself, and I'd be happy to contribute.

arxyzan commented 2 months ago

@rezashabrang So glad to have you with us. As you might have noticed, Hezar model prediction follows the same pipeline for any task, meaning that all models (independent of the task) take an input or a batch of inputs and output a list of results (whose shape is dependent on the task). So one main challenge is to implement the service in a way that keeps that same flexibility. I know this might have some overhead for now, so my best solution is to implement one POST route for each of the tasks (6-7 tasks for models, and also word embeddings; we can add preprocessors later too) so that each one has its own input/output request schema.

Regarding the second question, I think we can easily add an api or serve module in the root module (hezar.serve for example) and put everything there. Unfortunately, I'm not familiar with FastAPI design patterns and best practices, so I'd be glad to leave this to you. Let me know what you think.
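The one-route-per-task idea could start as a simple registry inside a future serving module, with each task contributing its own handler; a web layer (e.g. FastAPI) would later turn each entry into its own POST route with a task-specific schema. Everything below is a hypothetical sketch, not existing Hezar code:

```python
# Hypothetical sketch of a per-task handler registry for a serving module.
# Each task registers a handler; the web layer would expose one POST route
# per registered task, each with its own request/response schema.
TASK_HANDLERS = {}


def serve_task(name):
    """Decorator registering a handler under a task name."""
    def register(fn):
        TASK_HANDLERS[name] = fn
        return fn
    return register


@serve_task("text-classification")
def classify(inputs):
    # Placeholder: a real handler would call model.predict(inputs).
    return [{"label": "neutral", "score": 1.0} for _ in inputs]


print(sorted(TASK_HANDLERS))  # -> ['text-classification']
print(TASK_HANDLERS["text-classification"](["سلام"]))
```

Keeping the registry separate from the web framework means the same handlers could back FastAPI routes, a CLI, or Docker-image entrypoints without duplication.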