feast-dev / feast

The Open Source Feature Store for Machine Learning
https://feast.dev
Apache License 2.0

Add Chaining of Reusable Transformation Functions as Part of Feature Views #4696

Open Vishnu-Rangiah opened 3 weeks ago

Vishnu-Rangiah commented 3 weeks ago

Is your feature request related to a problem? Please describe.

Feature transformations should be composed of reusable components, written as Python functions, that can be assembled into a feature transformation pipeline.

Describe the solution you'd like

Chaining transformations into a pipeline that can be reused across feature views (FVs) would reduce duplicated transformation code across the feature repo.
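To make the idea concrete, here is a minimal sketch in plain Python, independent of Feast. The `pipeline` helper, the step functions, and the `fv_cents` / `fv_flags` names are all hypothetical and exist only for illustration. It shows two pipelines sharing one step instead of duplicating it.

```python
from functools import reduce
from typing import Callable, Dict, List

Row = Dict[str, float]
Step = Callable[[List[Row]], List[Row]]


def pipeline(*steps: Step) -> Step:
    """Compose steps left to right into one reusable transformation (hypothetical helper)."""
    def run(rows: List[Row]) -> List[Row]:
        return reduce(lambda acc, step: step(acc), steps, rows)
    return run


# Reusable building blocks; each takes rows and returns new rows.
def drop_negative_balances(rows: List[Row]) -> List[Row]:
    return [r for r in rows if r["balance"] >= 0]


def add_balance_in_cents(rows: List[Row]) -> List[Row]:
    return [{**r, "balance_cents": r["balance"] * 100} for r in rows]


def add_is_high_balance(rows: List[Row]) -> List[Row]:
    return [{**r, "is_high_balance": float(r["balance"] > 1000)} for r in rows]


# Two hypothetical feature views reuse the same first step.
fv_cents = pipeline(drop_negative_balances, add_balance_in_cents)
fv_flags = pipeline(drop_negative_balances, add_is_high_balance)
```

Each step is an ordinary function, so it can be unit tested on its own and reused in any number of pipelines.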

Here is an example adapted from Tecton's transformation API: https://docs.tecton.ai/docs/defining-features/feature-views/transformations#a-feature-view-that-calls-a-pyspark-transformation-passing-two-pyspark-transformation-outputs

@transformation(mode="big_query")
def last_balance_time(transactions, max_user_transaction):
    return f"""SELECT t.user_id, t.current_balance, last_t.last_transaction_date as timestamp
                      FROM {transactions} t
                      INNER JOIN {max_user_transaction} last_t
                      ON t.user_id = last_t.user_id AND t.timestamp = last_t.last_transaction_date;"""

@transformation(mode="big_query")
def user_last_transaction_time(transactions):
    return f"""SELECT user_id, MAX(timestamp) AS last_transaction_date
                      FROM {transactions}
                      GROUP BY user_id"""

@feature_view(
    sources=[credit_data_batch],
    entities=[user],
    mode="pipeline",  # creates a DAG from reusable transformation functions
    batch_schedule=timedelta(days=1),
    schema=[Field("user_id", String), Field("timestamp", Timestamp), Field("current_balance", Float64)],
)
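def user_last_balance(transactions):
    # Use a distinct local name: assigning to the function's own name here
    # would shadow it and raise UnboundLocalError on the call.
    max_user_transaction = user_last_transaction_time(transactions)
    return last_balance_time(transactions, max_user_transaction)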
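For discussion, here is one way the chaining above could work for SQL-returning functions. It is a runnable sketch, not Feast's or Tecton's actual implementation. The `transformation` decorator below is a hypothetical stand-in that wraps each function's SQL in parentheses so the result can be used as a subquery by the next transformation. The `@feature_view` decorator is left out, and `credit_data_batch` is used as a plain table name.

```python
from functools import wraps
from typing import Callable


def transformation(mode: str) -> Callable:
    """Hypothetical decorator: makes a SQL-returning function chainable by
    returning its query as a parenthesized subquery."""
    def decorator(fn: Callable[..., str]) -> Callable[..., str]:
        @wraps(fn)
        def wrapper(*inputs: str) -> str:
            # Drop a trailing semicolon so the query can be nested.
            sql = fn(*inputs).strip().rstrip(";")
            return f"({sql})"
        wrapper.mode = mode  # kept so a registry could dispatch on engine
        return wrapper
    return decorator


@transformation(mode="big_query")
def user_last_transaction_time(transactions: str) -> str:
    return f"""SELECT user_id, MAX(timestamp) AS last_transaction_date
               FROM {transactions}
               GROUP BY user_id"""


@transformation(mode="big_query")
def last_balance_time(transactions: str, max_user_transaction: str) -> str:
    return f"""SELECT t.user_id, t.current_balance, last_t.last_transaction_date AS timestamp
               FROM {transactions} t
               INNER JOIN {max_user_transaction} last_t
               ON t.user_id = last_t.user_id AND t.timestamp = last_t.last_transaction_date"""


def user_last_balance(transactions: str) -> str:
    # The output of one transformation feeds the next, forming a small DAG.
    max_user_transaction = user_last_transaction_time(transactions)
    return last_balance_time(transactions, max_user_transaction)


query = user_last_balance("credit_data_batch")
print(query)
```

Nesting subqueries keeps the sketch short. A real implementation might instead emit CTEs or build an explicit DAG, so that a transformation shared by several branches is computed only once.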

Describe alternatives you've considered

Additional context