fal-ai / fal

⚡ Fastest way to serve open source ML models to millions
https://fal.ai/docs
Apache License 2.0
201 stars 18 forks source link

feat(App): introduce /health endpoint #207

Closed efiop closed 1 month ago

efiop commented 1 month ago

Introducing default /health endpoint that will by default return {} and can be extended by user-provided info using fal.App.health() method. This endpoint could also be used as a default endpoint to try connecting to, to know if the app is running.

isidentical commented 1 month ago

question: why are we introducing this when the setup already handles the same logic? I just thought this'd offer a user configurable health endpoint for arbitrary stuff, not change the existing initialization logic

efiop commented 1 month ago

@isidentical We call setup() as a part of lifespan, which means that the endpoints are not operational until setup() is done and lifespan yields. This is the reason why our /health endpoint doesn't work in the repo I've seen it in.

https://github.com/fal-ai/fal/blob/951a856fafa0aa282506250f3ad0c6829b4d157d/projects/fal/src/fal/app.py#L119

isidentical commented 1 month ago

To me it is intentional. The server shouldn’t process any traffic before setup is finished. Every endpoint is written witn this assumption. The usage of health there seem to be redundant. I’d love if this was just a user controlled function that can be customized with a standard template.

On Mon, May 6, 2024 at 03:00 Ruslan Kuprieiev @.***> wrote:

@isidentical https://github.com/isidentical We call setup() as a part of lifespan, which means that the endpoints are not operational until setup() is done and lifespan yields. This is the reason why our /health endpoint doesn't work in the repo I've seen it in.

https://github.com/fal-ai/fal/blob/951a856fafa0aa282506250f3ad0c6829b4d157d/projects/fal/src/fal/app.py#L119

— Reply to this email directly, view it on GitHub https://github.com/fal-ai/fal/pull/207#issuecomment-2095004804, or unsubscribe https://github.com/notifications/unsubscribe-auth/ALJKHQKO2FET2JONQIAYSXLZA3BZFAVCNFSM6AAAAABHID7EMKVHI2DSMVQWIX3LMV43OSLTON2WKQ3PNVWWK3TUHMZDAOJVGAYDIOBQGQ . You are receiving this because you were mentioned.Message ID: @.***>

jfischoff commented 1 month ago

From my perspective I think we just need some way for the client to know the server isn’t ready. Returning a specific error code would be fine, or not listening on the port until after setup would also work for me

On Sun, May 5, 2024 at 5:13 PM Batuhan Taskaya @.***> wrote:

To me it is intentional. The server shouldn’t process any traffic before setup is finished. Every endpoint is written witn this assumption. The usage of health there seem to be redundant. I’d love if this was just a user controlled function that can be customized with a standard template.

On Mon, May 6, 2024 at 03:00 Ruslan Kuprieiev @.***> wrote:

@isidentical https://github.com/isidentical We call setup() as a part of lifespan, which means that the endpoints are not operational until setup() is done and lifespan yields. This is the reason why our /health endpoint doesn't work in the repo I've seen it in.

https://github.com/fal-ai/fal/blob/951a856fafa0aa282506250f3ad0c6829b4d157d/projects/fal/src/fal/app.py#L119

— Reply to this email directly, view it on GitHub https://github.com/fal-ai/fal/pull/207#issuecomment-2095004804, or unsubscribe < https://github.com/notifications/unsubscribe-auth/ALJKHQKO2FET2JONQIAYSXLZA3BZFAVCNFSM6AAAAABHID7EMKVHI2DSMVQWIX3LMV43OSLTON2WKQ3PNVWWK3TUHMZDAOJVGAYDIOBQGQ>

. You are receiving this because you were mentioned.Message ID: @.***>

— Reply to this email directly, view it on GitHub https://github.com/fal-ai/fal/pull/207#issuecomment-2095011424, or unsubscribe https://github.com/notifications/unsubscribe-auth/AABJEOZBWHUX3AT5CDFC2ELZA3DKTAVCNFSM6AAAAABHID7EMKVHI2DSMVQWIX3LMV43OSLTON2WKQ3PNVWWK3TUHMZDAOJVGAYTCNBSGQ . You are receiving this because your review was requested.Message ID: @.***>

efiop commented 1 month ago

But you already get 500 (Internal Server Error) as of today, until setup() is done and endpoints are ready. I was following @jfischoff 's attempt at /health implementation that also returns {"ready": false} until setup is done and then {"ready": true}, same as this PR.

The only traffic that we get here now is until middleware bounces you off with 503 if we are not ready, but we can set up the endpoints dynamically (including /health, but that means ready is meaningless, since if endpoint is available - we are already ready) after the setup is done.

efiop commented 1 month ago

Though it is also important to note that no matter what an endpoint can't become instantly available to treat us with ready: false messages, so user code will have to handle another error before that (IIRC "application was not found"). So indeed ready: false might be redundant, and we could totally get rid of it here and simplify.

efiop commented 1 month ago

Changed the PR to only provide /health using fal.App.health().

efiop commented 1 month ago

I'll use this for testing for now.

chamini2 commented 1 month ago

From my perspective I think we just need some way for the client to know the server isn’t ready. Returning a specific error code would be fine, or not listening on the port until after setup would also work for me

This is already the behaviour, no? I don't see why we need this

jfischoff commented 1 month ago

Is it? What is the error code that is returned for not ready?

On Mon, May 6, 2024 at 7:42 AM Matteo Ferrando @.***> wrote:

From my perspective I think we just need some way for the client to know the server isn’t ready. Returning a specific error code would be fine, or not listening on the port until after setup would also work for me

This is already the behaviour, no? I don't see why we need this

— Reply to this email directly, view it on GitHub https://github.com/fal-ai/fal/pull/207#issuecomment-2096201032, or unsubscribe https://github.com/notifications/unsubscribe-auth/AABJEOYMBXOJRQCUBEADXSLZA6JDVAVCNFSM6AAAAABHID7EMKVHI2DSMVQWIX3LMV43OSLTON2WKQ3PNVWWK3TUHMZDAOJWGIYDCMBTGI . You are receiving this because you were mentioned.Message ID: @.***>

efiop commented 1 month ago

@jfischoff Until the endpoint is working, you'll be getting 500: Error while forwarding the request (I don't know specifically the line of code in our backend where this is coming from, but still).

Well, with /health there is at least now a default endpoint to use and check for 200, so this is easier than using the endpoints themselves and having to deal with things like "method not allowed".

chamini2 commented 1 month ago

but we really do not use a real endpoint at all, we check if a sever is up at all or not @efiop

efiop commented 1 month ago

@chamini2 Sounds like I might be misunderstanding something. Are you talking about it from the perspective of how we check that on the backend?

chamini2 commented 1 month ago

@chamini2 Sounds like I might be misunderstanding something. Are you talking about it from the perspective of how we check that on the backend?

yeah, I am talking about that. is this not what you refer to?

efiop commented 1 month ago

Oh, I mean that /health could be (and should be) used by users, not our backend.

chamini2 commented 1 month ago

got it!