yummyml / yummy

ParquetSource as a CUSTOM_SOURCE sometimes breaks the feature registry #14

Open fura95 opened 1 year ago

fura95 commented 1 year ago

I created the features as shown in the following code:

from datetime import timedelta
from feast import Entity, FeatureView, Field, ValueType, FeatureService, PushSource
from feast.types import Float32, Int64
from yummy import ParquetSource

card_source = ParquetSource(
    name="card_source",
    path="data_parquet/v_aggapp_card_source.parquet",
    timestamp_field="event_timestamp",
    created_timestamp_column="created",
)

credit_source = ParquetSource(
    name="credit_source",
    path="/data_parquet/v_aggapp_credit_hist_source.parquet",
    timestamp_field="event_timestamp",
    created_timestamp_column="created",
)

card_entity = Entity(name="v_aggapp_card_entity", join_keys=["mdm_customer_rk"])
credit_entity = Entity(name="v_aggapp_credit_entity", join_keys=["mdm_customer_rk"])

card_features = FeatureView(
    name="v_aggapp_card_parquet",
    entities=[card_entity],
    ttl=timedelta(weeks=52),
    schema=[
        Field(name="mdm_customer_rk", dtype=Int64),
        Field(name="cnt_mcc_br5_cat4_6", dtype=Int64),
    ],
    source=card_source,
    online=True,
    tags={"test_tag": "cards"}
)

credit_features = FeatureView(
    name="v_aggapp_credit_parquet",
    entities=[credit_entity],
    ttl=timedelta(weeks=52),
    schema=[
        Field(name="mdm_customer_rk", dtype=Int64),
        Field(name="loan_age_mortg_min", dtype=Int64),
        Field(name="delinq_share_30p_ext_lifo", dtype=Int64),
        Field(name="length_ext", dtype=Int64),
        Field(name="max_util_card_act", dtype=Int64),
        Field(name="pmt_delays_1_29_24m_sum_mnth_lifo", dtype=Int64),
    ],
    source=credit_source,
    online=True,
    tags={"test_tag": "credits"}
)

credit_card_activity_v1 = FeatureService(
    name="credit_card_activity_v1", features=[card_features, credit_features]
)

# Defines a way to push data (to be available offline, online or both) into Feast.
card_push_source = PushSource(
    name="v_aggapp_card_push_source",
    batch_source=card_source,
)

credit_push_source = PushSource(
    name="v_aggapp_credit_push_source",
    batch_source=credit_source,
)

When I execute the commands in the following sequence:

feast apply
feast ui

I get the following error:

Traceback (most recent call last):
  File "/home/feast/feast_plugins/yummy_test/env/bin/feast", line 8, in <module>
    sys.exit(cli())
  File "/home/feast/feast_plugins/yummy_test/env/lib/python3.9/site-packages/click/core.py", line 1137, in __call__
    return self.main(*args, **kwargs)
  File "/home/feast/feast_plugins/yummy_test/env/lib/python3.9/site-packages/click/core.py", line 1062, in main
    rv = self.invoke(ctx)
  File "/home/feast/feast_plugins/yummy_test/env/lib/python3.9/site-packages/click/core.py", line 1668, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "/home/feast/feast_plugins/yummy_test/env/lib/python3.9/site-packages/click/core.py", line 1404, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/home/feast/feast_plugins/yummy_test/env/lib/python3.9/site-packages/click/core.py", line 763, in invoke
    return __callback(*args, **kwargs)
  File "/home/feast/feast_plugins/yummy_test/env/lib/python3.9/site-packages/click/decorators.py", line 26, in new_func
    return f(get_current_context(), *args, **kwargs)
  File "/home/feast/feast_plugins/yummy_test/env/lib/python3.9/site-packages/feast/cli.py", line 492, in apply_total_command
    apply_total(repo_config, repo, skip_source_validation)
  File "/home/feast/feast_plugins/yummy_test/env/lib/python3.9/site-packages/feast/usage.py", line 276, in wrapper
    return func(*args, **kwargs)
  File "/home/feast/feast_plugins/yummy_test/env/lib/python3.9/site-packages/feast/repo_operations.py", line 305, in apply_total
    apply_total_with_repo_instance(
  File "/home/feast/feast_plugins/yummy_test/env/lib/python3.9/site-packages/feast/repo_operations.py", line 265, in apply_total_with_repo_instance
    registry_diff, infra_diff, new_infra = store.plan(repo)
  File "/home/feast/feast_plugins/yummy_test/env/lib/python3.9/site-packages/feast/usage.py", line 287, in wrapper
    raise exc.with_traceback(traceback)
  File "/home/feast/feast_plugins/yummy_test/env/lib/python3.9/site-packages/feast/usage.py", line 276, in wrapper
    return func(*args, **kwargs)
  File "/home/feast/feast_plugins/yummy_test/env/lib/python3.9/site-packages/feast/feature_store.py", line 701, in plan
    registry_diff = diff_between(
  File "/home/feast/feast_plugins/yummy_test/env/lib/python3.9/site-packages/feast/diff/registry_diff.py", line 249, in diff_between
    ) = extract_objects_for_keep_delete_update_add(
  File "/home/feast/feast_plugins/yummy_test/env/lib/python3.9/site-packages/feast/diff/registry_diff.py", line 207, in extract_objects_for_keep_delete_update_add
    ] = FeastObjectType.get_objects_from_registry(registry, current_project)
  File "/home/feast/feast_plugins/yummy_test/env/lib/python3.9/site-packages/feast/registry.py", line 92, in get_objects_from_registry
    FeastObjectType.DATA_SOURCE: registry.list_data_sources(project=project),
  File "/home/feast/feast_plugins/yummy_test/env/lib/python3.9/site-packages/feast/infra/registry_stores/sql.py", line 380, in list_data_sources
    return self._list_objects(
  File "/home/feast/feast_plugins/yummy_test/env/lib/python3.9/site-packages/feast/infra/registry_stores/sql.py", line 821, in _list_objects
    return [
  File "/home/feast/feast_plugins/yummy_test/env/lib/python3.9/site-packages/feast/infra/registry_stores/sql.py", line 822, in <listcomp>
    python_class.from_proto(
  File "/home/feast/feast_plugins/yummy_test/env/lib/python3.9/site-packages/typeguard/__init__.py", line 1033, in wrapper
    retval = func(*args, **kwargs)
  File "/home/feast/feast_plugins/yummy_test/env/lib/python3.9/site-packages/feast/data_source.py", line 328, in from_proto
    cls = get_data_source_class_from_type(data_source.data_source_class_type)
  File "/home/feast/feast_plugins/yummy_test/env/lib/python3.9/site-packages/feast/repo_config.py", line 369, in get_data_source_class_from_type
    module_name, config_class_name = data_source_type.rsplit(".", 1)
ValueError: not enough values to unpack (expected 2, got 1)

fura95 commented 1 year ago

The problem is that the data source is decoded incorrectly from the registry proto value (it decodes to an empty value). According to the traceback, the cause is most likely somewhere in the feast library.
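
To illustrate why an empty value leads to exactly this ValueError, here is a minimal sketch that repeats only the `rsplit(".", 1)` unpacking shown in the last frame of the traceback. It does not call feast itself; the helper names `split_class_path` and `try_split` and the dotted path `some_package.some_module.SomeSource` are hypothetical and used only for illustration.

```python
from typing import Tuple, Union


def split_class_path(data_source_type: str) -> Tuple[str, str]:
    # Same unpacking as in the traceback: split on the last dot into
    # a module name and a class name. Raises ValueError if there is no dot.
    module_name, class_name = data_source_type.rsplit(".", 1)
    return module_name, class_name


def try_split(value: str) -> Union[Tuple[str, str], str]:
    # Hypothetical helper: return the split result, or the error message.
    try:
        return split_class_path(value)
    except ValueError as exc:
        return str(exc)


# Illustrative dotted path (an assumption, not the real yummy class path).
ok_result = try_split("some_package.some_module.SomeSource")

# An empty string, as when the class type is decoded to an empty value.
empty_result = try_split("")

print(ok_result)
print(empty_result)
```

With an empty string, `rsplit` returns a single-element list, so the two-name unpacking fails. Any stored class type without a dot would fail the same way, so the sketch only shows the symptom; it does not show why the stored value is empty.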

qooba commented 1 year ago

@fura95 - I have checked feast ui: it runs but does not display any information. There are no errors on the backend side, only JavaScript errors. I don't think I can help here.