feast-dev / feast

The Open Source Feature Store for Machine Learning
https://feast.dev
Apache License 2.0

Feast will attempt to create a BigQuery dataset regardless of `table_create_disposition` #4648

Closed danbaron63 closed 1 month ago

danbaron63 commented 1 month ago

Expected Behavior

When `table_create_disposition` is set to `CREATE_NEVER`, Feast should not create a BigQuery dataset.

Current Behavior

`table_create_disposition` is ignored, and Feast attempts to create the dataset regardless. This is a problem for organizations that need to manage warehouse infrastructure outside of Feast.

Steps to reproduce

Run get_historical_features on any BigQuery offline store. Example code:

```python
from feast import FeatureStore
from pathlib import Path
from datetime import datetime
import pandas as pd

config = Path("feature_store.yaml")
store = FeatureStore(fs_yaml_file=config)
training_df = store.get_historical_features(
    entity_df=pd.DataFrame.from_dict(
        {
            "id": ["<uid>"],
            "feature_timestamp": [datetime(2020, 1, 1)]
        }
    ),
    features=[
        "<feature_view>:<feature>"
    ]
)

print(training_df.to_df())
```

Specifications

Possible Solution

`table_create_disposition` should be checked in this method. If the dataset does not exist and `table_create_disposition` is `CREATE_NEVER`, an exception should be raised instead of creating the dataset.
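
A rough sketch of the check described above, not the actual Feast code or the contents of the PR. `ensure_dataset`, `FakeBigQueryClient`, and `DatasetNotFound` are hypothetical names used only for illustration, and the disposition is assumed to be a plain string. The fake client stands in for `google.cloud.bigquery.Client`, whose `get_dataset` raises `google.api_core.exceptions.NotFound` when the dataset is missing.

```python
class DatasetNotFound(Exception):
    """Hypothetical stand-in for google.api_core.exceptions.NotFound."""


class FakeBigQueryClient:
    """In-memory stand-in for a BigQuery client, for illustration only."""

    def __init__(self, existing=()):
        self.datasets = set(existing)
        self.created = []  # records every dataset this client created

    def get_dataset(self, dataset_id):
        if dataset_id not in self.datasets:
            raise DatasetNotFound(dataset_id)
        return dataset_id

    def create_dataset(self, dataset_id, exists_ok=False):
        if dataset_id in self.datasets and not exists_ok:
            raise ValueError(f"dataset {dataset_id} already exists")
        self.datasets.add(dataset_id)
        self.created.append(dataset_id)


def ensure_dataset(client, dataset_id, table_create_disposition):
    """Create the dataset only if the disposition allows it."""
    try:
        client.get_dataset(dataset_id)
        return  # dataset already exists, nothing to do
    except DatasetNotFound:
        if table_create_disposition == "CREATE_NEVER":
            # Fail loudly instead of creating infrastructure behind the user's back.
            raise RuntimeError(
                f"Dataset {dataset_id} does not exist and "
                "table_create_disposition is CREATE_NEVER"
            ) from None
    client.create_dataset(dataset_id, exists_ok=True)
```

With this shape, a missing dataset is created only under `CREATE_IF_NEEDED`, and an existing dataset is left untouched under either setting.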

danbaron63 commented 1 month ago

Draft PR here https://github.com/feast-dev/feast/pull/4649/files

Still WIP as some tests are failing locally (possibly due to my environment setup though). Will see if I can get them working next week some time!