feast-dev / feast

The Open Source Feature Store for Machine Learning
https://feast.dev
Apache License 2.0

Add Airflow Orchestration Integration for Feature View Transformations #4695

Open Vishnu-Rangiah opened 1 month ago

Vishnu-Rangiah commented 1 month ago

Is your feature request related to a problem? Please describe.

Today, users cannot schedule their feature transformations through simple configuration on Feast objects.

Describe the solution you'd like

The orchestration of feature pipelines should be handled by tools such as Airflow. Schedules can be provided using the @transform decorator.

from datetime import timedelta

@transform(
    sources=[credit_data_batch],
    entities=[user],
    mode="python",
    # Proposed: feast apply creates an Airflow DAG from this schedule and deploys it during CI/CD
    batch_schedule=timedelta(days=1),
    schema=[
        Field(name="user_id", dtype=String),
        Field(name="timestamp", dtype=Timestamp),
        Field(name="current_balance", dtype=Float64),
    ],
)
def user_last_balance(transactions):
    return transactions[["user_id", "timestamp", "current_balance"]]
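
To make the idea concrete, here is a minimal sketch in plain Python (no Feast imports) of how a decorator could record the schedule so that a later deployment step can read it. The `transform` function below is a hypothetical stand-in for the proposed decorator, and `ScheduledTransform` and `REGISTRY` are illustrative names, not existing Feast APIs.

```python
from dataclasses import dataclass
from datetime import timedelta
from typing import Callable, Dict, Optional


@dataclass(frozen=True)
class ScheduledTransform:
    """Hypothetical record of a transformation and its batch schedule."""
    name: str
    fn: Callable
    batch_schedule: Optional[timedelta]


# Hypothetical in-memory registry that a `feast apply` step could inspect.
REGISTRY: Dict[str, ScheduledTransform] = {}


def transform(batch_schedule: Optional[timedelta] = None, **options):
    """Stand-in for the proposed decorator; extra options are accepted but ignored here."""
    if batch_schedule is not None and batch_schedule <= timedelta(0):
        raise ValueError("batch_schedule must be a positive timedelta")

    def decorator(fn: Callable) -> Callable:
        REGISTRY[fn.__name__] = ScheduledTransform(fn.__name__, fn, batch_schedule)
        return fn  # the decorated function stays directly callable

    return decorator


@transform(batch_schedule=timedelta(days=1), mode="python")
def user_last_balance(transactions):
    return transactions
```

Returning the original function keeps the transformation testable on its own, while the registry gives the deployment step one place to look up which transformations need a schedule.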

Ideally, the user defines their feature logic and schedule with the feature_view / transform decorator and provides the configuration for their Airflow instance (local, Astronomer, Composer, etc.) in feature_store.yaml. The CI/CD step (feast apply) should then validate and deploy the assets needed to orchestrate the transformations through Airflow.
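
As a rough illustration of the deploy step, the sketch below renders the source of an Airflow DAG file from a transformation name and its batch_schedule. Everything here is an assumption rather than existing Feast behavior: the `render_dag` helper, the `feast_` DAG id prefix, the placeholder `run_transformation` task, and the fixed start date are hypothetical, and the generated file assumes an Airflow 2.x release that accepts the `schedule` argument (2.4 or later).

```python
from datetime import timedelta

# Template for the generated DAG file. The task body is a placeholder and the
# start date is arbitrary; neither reflects an existing Feast integration.
DAG_TEMPLATE = '''\
from datetime import datetime, timedelta

from airflow import DAG
from airflow.operators.python import PythonOperator


def run_transformation():
    # Placeholder: the real task would execute the Feast transformation.
    raise NotImplementedError


with DAG(
    dag_id={dag_id!r},
    schedule=timedelta(seconds={seconds}),
    start_date=datetime(2024, 1, 1),
    catchup=False,
) as dag:
    PythonOperator(task_id="run_transformation", python_callable=run_transformation)
'''


def render_dag(transform_name: str, batch_schedule: timedelta) -> str:
    """Return the source of an Airflow DAG file for one scheduled transformation."""
    if batch_schedule <= timedelta(0):
        raise ValueError("batch_schedule must be positive")
    return DAG_TEMPLATE.format(
        dag_id=f"feast_{transform_name}",
        seconds=int(batch_schedule.total_seconds()),
    )


source = render_dag("user_last_balance", timedelta(days=1))
# Syntax check only: compiling does not import Airflow or run the DAG.
compile(source, "feast_user_last_balance.py", "exec")
```

Generating a file rather than registering the DAG at runtime fits the CI/CD flow described above, since the rendered file can be validated and then copied to whichever DAG folder the configured Airflow instance reads.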

Describe alternatives you've considered

Additional context