feast-dev / feast

The Open Source Feature Store for Machine Learning
https://feast.dev
Apache License 2.0

Add a family of Model objects to feast #4288

Open tokoko opened 2 weeks ago

tokoko commented 2 weeks ago

This PR is part of an effort to improve integration of feast with model serving. Also see #4139 and the accompanying draft RFC for wider context.

Feast currently only handles feature retrieval, but stays away from model execution. This PR proposes extending the set of feast objects with a family of Model objects that represent actual machine learning models. There are two primary benefits that this new object targets:

  1. There will be two initial variations of models (classes extending the Model interface).

  2. A new infer method in the FeatureStore class will enable users to call feature retrieval and model computation in a single call. This method will optionally allow the user to specify whether cached values are acceptable.

Example Usage:

store = FeatureStore()

example_feature_service = ...

def example_model(data_input: pd.DataFrame) -> pd.DataFrame:
    data_input['result'] = data_input['param1'] + 1
    return data_input

example_python_model = PythonModel(
    name = 'example_python_model',
    features = example_feature_service,
    model = example_model
)

example_oip_model = OipModel(
    name = 'example_oip_model',
    features = example_feature_service,
    url = 'http://...',
    model_name = 'example_oip_model',
    model_version = 1
)

store.apply([..., example_feature_service, example_python_model, example_oip_model])

entity_df = pd.DataFrame(...)

store.infer(
    model='example_python_model',  # or 'example_oip_model'
    entities=entity_df,
    log_features=True,
    force_recompute=True,
)
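
For context, a rough sketch of what the two initial Model variations could look like under the hood. The class and field names mirror the example above, but the method names and internals are assumptions, not a settled API:

from abc import ABC, abstractmethod
import pandas as pd

class Model(ABC):
    """Base class for all feast model objects (hypothetical sketch)."""

    def __init__(self, name: str, features):
        self.name = name
        self.features = features  # a FeatureService providing model inputs

    @abstractmethod
    def predict(self, features_df: pd.DataFrame) -> pd.DataFrame:
        """Run the model against a dataframe of retrieved feature values."""

class PythonModel(Model):
    """Wraps a user-provided Python callable as a model."""

    def __init__(self, name, features, model):
        super().__init__(name, features)
        self.model = model

    def predict(self, features_df: pd.DataFrame) -> pd.DataFrame:
        return self.model(features_df)

class OipModel(Model):
    """Points at a remote endpoint speaking the Open Inference Protocol."""

    def __init__(self, name, features, url, model_name, model_version):
        super().__init__(name, features)
        self.url = url
        self.model_name = model_name
        self.model_version = model_version

    def predict(self, features_df: pd.DataFrame) -> pd.DataFrame:
        # Would translate the dataframe into an OIP request and call the endpoint;
        # omitted here, see the payload sketch further down the thread.
        raise NotImplementedError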
tokoko commented 2 weeks ago

@franciscojavierarceo Your PR prompted me to draw this up. wdyt?

franciscojavierarceo commented 2 weeks ago

Yeah, I like this. I think we should offer both infer and predict, with predict just aliasing to infer. I think that will be intuitive for traditional MLEs working with tabular data.
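
A minimal sketch of how that alias could look on FeatureStore, assuming infer stays the canonical method (illustrative only):

class FeatureStore:
    def infer(self, model, entities, log_features=False, force_recompute=False):
        ...  # feature retrieval + model execution as proposed above

    # predict is a thin alias for infer, for MLEs used to sklearn-style naming
    def predict(self, *args, **kwargs):
        return self.infer(*args, **kwargs)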

tokoko commented 2 weeks ago

yup, an alias sounds fine. Another thing that came to my mind: we will probably have to come up with an appropriate abstraction to put this functionality in (behind FeatureStore). I'm thinking of InferenceEngine or something similar, with LocalInferenceEngine and RemoteInferenceEngine as the (probably only) two implementations. That would enable us to apply the security model to these methods as well. (fyi @dmartinol @redhatHameed)
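
To make the idea concrete, a rough sketch of what such an abstraction might look like. Everything here is an assumption rather than an agreed design; model.features is assumed to be a FeatureService and model.predict comes from the hypothetical Model sketch above:

from abc import ABC, abstractmethod
import pandas as pd

class InferenceEngine(ABC):
    """Pluggable layer that FeatureStore.infer would delegate to."""

    @abstractmethod
    def infer(self, model, entities: pd.DataFrame, log_features: bool = False,
              force_recompute: bool = False) -> pd.DataFrame:
        ...

class LocalInferenceEngine(InferenceEngine):
    """Runs feature retrieval and model execution in-process."""

    def __init__(self, store):
        self.store = store

    def infer(self, model, entities, log_features=False, force_recompute=False):
        # Retrieve the model's input features, then execute the model locally
        features = self.store.get_online_features(
            features=model.features,
            entity_rows=entities.to_dict("records"),
        ).to_df()
        return model.predict(features)

class RemoteInferenceEngine(InferenceEngine):
    """Forwards the call to a remote feast server, where the security model
    (permissions, auth) can also be enforced."""

    def __init__(self, url: str):
        self.url = url

    def infer(self, model, entities, log_features=False, force_recompute=False):
        raise NotImplementedError  # e.g. an HTTP call to the remote server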

franciscojavierarceo commented 2 weeks ago

Yeah, that makes sense.

We can use KServe's protobuf definition too.

franciscojavierarceo commented 2 weeks ago

FYI @rimolive

franciscojavierarceo commented 2 weeks ago

@HaoXuAI @shuchu

tokoko commented 2 weeks ago

> Yeah, that makes sense.
>
> We can use KServe's protobuf definition too.

I'm pretty sure that's the same thing as OIP, it just had a couple of name changes along the way. V2 Inference Protocol, KServe V2 Protocol, Open Inference Protocol... they are all the same.

franciscojavierarceo commented 2 weeks ago

Yup! I just wanted to provide the link to it.

dmartinol commented 2 weeks ago

@tokoko (thanks for sharing!) why don't you also look at the Model Registry capabilities? This component provides a central repository to store and manage models, together with the serving endpoints for each deployment of a model artifact (for this I need to find exact doc references).

In short, instead of abstracting an InferenceEngine, we could abstract a ModelRegistry and provide an initial implementation for this Kubeflow component, WDYT?
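
A very rough sketch of what such an abstraction might look like; the interface, fields, and the Kubeflow-backed implementation below are assumptions for illustration only:

from abc import ABC, abstractmethod
from dataclasses import dataclass
from typing import Optional

@dataclass
class RegisteredModel:
    name: str
    version: str
    serving_url: str  # endpoint of the currently deployed model artifact

class ModelRegistry(ABC):
    """Central repository of models and the serving endpoints behind them."""

    @abstractmethod
    def get_model(self, name: str, version: Optional[str] = None) -> RegisteredModel:
        ...

class KubeflowModelRegistry(ModelRegistry):
    """Hypothetical adapter for the Kubeflow Model Registry component."""

    def __init__(self, base_url: str):
        self.base_url = base_url

    def get_model(self, name, version=None):
        raise NotImplementedError  # would query the registry's REST API for metadata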

BTW: This requirement appears more aligned with an external add-on for handling both feature retrieval and model execution together, rather than a core component expected within the 'feature store'. Wouldn't this introduce too many (potentially unnecessary) dependencies?

tokoko commented 1 week ago

@dmartinol thanks for joining in. (This will probably be lengthy.) Let me start from your last point, since it's probably crucial that we agree at a high level on what we're trying to build here. Yes, this is definitely going outside 'feature store' territory and I'm also a bit uncomfortable about that. Having said that, if we find some utility in extending feast this way that's overlooked by other oss tools out there, we should still proceed. I know we might be committing a terrible case of feature creep here, happy to be called out if you think we are πŸ˜„.

The RFC linked above goes into this, but let me briefly recap here as well... I think there are two general ways to bridge feature retrieval and model execution:

  1. If a user already has a somewhat complex serving architecture in place (I'm thinking of a model mesh here, either kserve or seldon), then the best way to go about it would be to treat feature retrieval as a sort of transformer/preprocessor "model". The control plane of kserve (or seldon) would call feast first and the model endpoint after that, passing along the retrieved feature values. One idea I floated in the RFC to support this flow is to modify our http/grpc feature servers to make them OIP-compliant, effectively treating our FeatureService objects as models. If you have a feature service `example_service` in the registry, calling the `infer` endpoint of a feature server for a "model" `example_service` would return the necessary feature values (see the sketch after this list). Since both kserve and seldon talk OIP, integration should be pretty simple. This flow is probably appropriate when an org has hundreds of models running and relying on kserve/seldon makes sense.

  2. This ticket comes into play for a simpler deployment flow, when you are able to spin up a model endpoint somewhere and also have a feast deployment in place, but have no easy way to bridge the gap between the two. feast itself takes on the role of a very simple control plane and tries to abstract away model endpoints from the caller. There are also some seemingly beneficial side-effects, mainly the ability to rely on a feast online store for the storage of precomputed (meaning batch) or cached (written out after online calls with some ttl) results of particular models. Not sure if the same can be achieved with an external add-on, but that's also a possibility.
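To illustrate the first option, a sketch of what an OIP-style call against a feature server could look like if FeatureService objects were exposed as models. The endpoint path and payload follow the V2/OIP REST conventions, but the feature server does not support any of this today, so treat the URL, port, and tensor names as assumptions:

import requests

# Hypothetical: a feature server exposing feature service 'example_service' as an OIP "model".
# The request carries entity keys as input tensors; the response would carry feature values.
request_body = {
    "inputs": [
        {"name": "driver_id", "shape": [2], "datatype": "INT64", "data": [1001, 1002]},
    ]
}

resp = requests.post(
    "http://feast-feature-server:6566/v2/models/example_service/infer",
    json=request_body,
)
feature_tensors = resp.json()["outputs"]  # one tensor per requested feature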

Regarding model registry, if the registry indeed contains info about current deployment endpoints (not just trained artifacts), that might be a good idea, although I'm not sure where we would keep additional information regarding models (whether we want to enable caching or not, permissions and so on) if we don't have individual Model objects in our registry and instead rely on a list of models supplied by the registry, IIUC.

franciscojavierarceo commented 1 week ago

> This ticket comes into play for a simpler deployment flow, when you are able to spin up a model endpoint somewhere and also have a feast deployment in place, but have no easy way to bridge the gap between the two. feast itself takes on the role of a very simple control plane and tries to abstract away model endpoints from the caller. There are also some seemingly beneficial side-effects, mainly the ability to rely on a feast online store for the storage of precomputed (meaning batch) or cached (written out after online calls with some ttl) results of particular models. Not sure if the same can be achieved with an external add-on, but that's also a possibility.

This was one of my goals when drafting the RFC: (1) creating some standard abstractions that give guidance to the community and (2) enabling faster retrieval of predictions.

franciscojavierarceo commented 1 week ago

FYI here's the link to the PR @tokoko mentioned: https://github.com/feast-dev/feast/issues/4288

etirelli commented 1 week ago

@tokoko @franciscojavierarceo thank you for the write-up/spec. I have concerns around the overall vision for the project and our strategy to achieve that vision, in light of this RFE. KubeFlow, as the open source MLOps platform, already supports and develops KFP, MR and KServe. There is clearly a gap there for the feature store that Feast can fill. However, in that architecture, Feast would be a service to both KFP and KServe, while MR would be responsible for recording the model metadata necessary to connect serving/inference with the feature store. Adding model metadata and API into the feature store, as proposed here, seems to invert the dependency graph in that architecture and create an overlap with the responsibilities of those other components. I am afraid that if we move forward with this RFC, it will cause confusion for users and make our integration with kubeflow more complicated, creating friction both from a technical as well as an organizational perspective. I would advise that we focus on the integration with these other components in the ecosystem instead.

franciscojavierarceo commented 1 week ago

@etirelli thanks for the feedback here! Agree we shouldn't cause confusion and we should have clean and separate responsibilities from the other components.

I don't want to invert the dependency graph but I do want to enable faster model retrieval and I think there's a way to accomplish both of those things. I'll do more research and get back.

tokoko commented 5 days ago

Maybe the way to go without blurring the lines too much would be to somehow build this as a 3rd party tool (inside the feast-dev organization if possible???). That was actually my plan a while ago; I already have a repository named hefeasto internally πŸ˜„.

We could base that 3rd party tool on the existing FeatureStore interface, or even expose some new abstractions if needed (for storing cached values in the online store, checking whether the cache is stale or not, etc...). Some of those new abstractions might even prove useful while integrating with kserve or seldon.

tokoko commented 5 days ago

To be clear, by better integration with kserve I mean that the flow on the kserve side, instead of looking like this (get_features_from_feast -> get_model_prediction_from_model_endpoint), might look more like this (check_if_feast_has_cached_result -> (if not) get_features_from_feast -> get_model_prediction_from_model_endpoint -> update_feast_cache).
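
Roughly, a sketch of that cache-aware flow as it might run inside a kserve/seldon transformer. The steps mirror the pseudo-flow above; the "_cache" feature view, the model object, and its predict method are assumptions, not existing feast APIs:

import pandas as pd

def cached_infer(store, model, entity_df: pd.DataFrame) -> pd.DataFrame:
    # 1. check_if_feast_has_cached_result: look up previously written predictions
    #    (ttl/staleness checking omitted for brevity)
    cached = store.get_online_features(
        features=[f"{model.name}_cache:result"],
        entity_rows=entity_df.to_dict("records"),
    ).to_df()
    if cached["result"].notna().all():
        return cached

    # 2. get_features_from_feast: retrieve the model's input features
    features = store.get_online_features(
        features=model.features,
        entity_rows=entity_df.to_dict("records"),
    ).to_df()

    # 3. get_model_prediction_from_model_endpoint
    predictions = model.predict(features)

    # 4. update_feast_cache: write predictions back so the next call can reuse them
    store.write_to_online_store(f"{model.name}_cache", predictions)
    return predictions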