Lightning-AI / LitServe

Lightning-fast serving engine for any AI model of any size. Flexible. Easy. Enterprise-scale.
https://lightning.ai/docs/litserve
Apache License 2.0

Callback system v0 #278

Closed aniketmaurya closed 2 months ago

aniketmaurya commented 2 months ago
Before submitting

- [ ] Was this discussed/agreed via a GitHub issue? (no need for typos and docs improvements)
- [ ] Did you read the [contributor guideline](https://github.com/Lightning-AI/pytorch-lightning/blob/main/.github/CONTRIBUTING.md), Pull Request section?
- [ ] Did you make sure to update the docs?
- [ ] Did you write any new necessary tests?

⚠️ How does this PR impact the user? ⚠️

Allow users to trigger arbitrary functions via a callback system. This lets them inject useful functionality such as logging inference time, collecting metrics, etc.


What does this PR do?

Implements a callback system with hooks that are triggered on server and LitAPI events (setup, predict, etc.).

import time

import litserve as ls

class MetricLogger(ls.Callback):
    def on_before_predict(self, lit_api: "LitAPI"):
        t0 = time.perf_counter()
        self._start_time = t0

    def on_after_predict(self, lit_api: "LitAPI"):
        t1 = time.perf_counter()
        elapsed = t1 - self._start_time
        # print or write to a DB like Grafana
        print(f"Prediction took {elapsed:.2f} seconds")

lit_api = ls.test_examples.SimpleLitAPI()
server = ls.LitServer(lit_api, callbacks=[MetricLogger()])
server.run()

A caveat of this implementation: even though the callbacks are classes, it is not straightforward to share state between two hooks such as on_litapi_predict_end and on_server_setup_start, because they run in different processes; once the callback object is pickled into a worker process, its state is no longer synced back.

I think this is okay, considering that hooks that run in the same context, e.g. on_litapi_predict_start and on_litapi_predict_end (the on_litapi_*_* family), will share state within the same process.
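To make the caveat concrete, here is a hypothetical callback (using hook names that appear in the examples in this PR; the process layout described in the comments is an assumption about how LitServer spawns its workers):

import litserve as ls

class StatefulCallback(ls.Callback):
    def on_server_start(self, litserver):
        # Assumed to run in the main server process.
        self.request_count = 0

    def on_after_predict(self, lit_api: "LitAPI"):
        # Assumed to run in an inference worker. The callback object was pickled
        # when the worker was spawned, so this increments the worker's own copy;
        # the server process never sees the updated value.
        self.request_count = getattr(self, "request_count", 0) + 1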


As a result of this PR, we will be able to refactor the monitoring metrics PR as a callback.

import time

import litserve as ls
from prometheus_client import Summary
from prometheus_client import multiprocess, CollectorRegistry, make_asgi_app

def make_metrics_app():
    generate_metrics_dir()  # helper not shown here; assumed to prepare the directory used by Prometheus multiprocess mode
    registry = CollectorRegistry()
    multiprocess.MultiProcessCollector(registry)
    return make_asgi_app(registry=registry)

class PrometheusCallback(ls.Callback):
    def __init__(self):
        self.summary = Summary("PREDICT_METRIC", "A metric to summarize the predict execution times")

    def on_server_start(self, litserver):
        metrics_app = make_metrics_app()
        litserver.app.mount("/metrics", metrics_app)

    def on_before_predict(self, lit_api: "LitAPI"):
        t0 = time.perf_counter()
        self._start_time = t0

    def on_after_predict(self, lit_api: "LitAPI"):
        t1 = time.perf_counter()
        elapsed = t1 - self._start_time
        self.summary.observe(elapsed)
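Wiring this up would mirror the first example. Note that prometheus_client's multiprocess mode expects the PROMETHEUS_MULTIPROC_DIR environment variable to point at a writable directory, which is presumably what generate_metrics_dir takes care of:

lit_api = ls.test_examples.SimpleLitAPI()
server = ls.LitServer(lit_api, callbacks=[PrometheusCallback()])
server.run()
# Prediction timings can then be scraped from the mounted /metrics endpoint.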

TODO

PR review

Anyone in the community is free to review the PR once the tests have passed. If we didn't discuss your PR in a GitHub issue, there's a high chance it will not be merged.

Did you have fun?

Make sure you had fun coding 🙃

codecov[bot] commented 2 months ago

Codecov Report

Attention: Patch coverage is 96.19048% with 4 lines in your changes missing coverage. Please review.

Project coverage is 95%. Comparing base (cf78c85) to head (4b39bac). Report is 1 commit behind head on main.

Additional details and impacted files

```diff
@@           Coverage Diff           @@
##             main    #278    +/-   ##
=======================================
  Coverage      95%     95%
=======================================
  Files          14      18      +4
  Lines        1076    1173     +97
=======================================
+ Hits         1019    1114     +95
- Misses         57      59      +2
```
williamFalcon commented 2 months ago

@aniketmaurya

For the LitAPI hook names, remove the litapi prefix:

LitAPI

GOOD: on_setup_start
BAD: on_litapi_setup_start
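So a user-facing callback would read roughly like this (a sketch of the proposed naming, not the final API):

class MyCallback(ls.Callback):
    def on_setup_start(self, lit_api: "LitAPI"):
        # Formerly on_litapi_setup_start in this PR's naming.
        print("LitAPI setup starting")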

LitServer

For the LitServer, what do these two do?