snowflakedb / snowflake-ml-python

Apache License 2.0

send_api_usage_telemetry decorator fails and hides public API errors #33

Closed monai closed 7 months ago

monai commented 1 year ago

I'm following the Snowpark ML tutorial, and one of the code examples throws an error. The error is handled by the telemetry collection code, which then throws an error of its own, so I only see the error from the telemetry code and never the original one.

import snowflake.ml.modeling.preprocessing as snowml

<...>

snowml_oe = snowml.OrdinalEncoder(
    input_cols=["CUT", "CLARITY"],
    output_cols=["CUT_OE", "CLARITY_OE"],
    categories=categories,
)
ord_encoded_diamonds_df = snowml_oe.fit(normalized_diamonds_df).transform(
    normalized_diamonds_df
)

And the error is:

Traceback (most recent call last):
  File "/Users/juozas/projects/lt/spml/transformations.py", line 90, in <module>
    ord_encoded_diamonds_df = snowml_oe.fit(normalized_diamonds_df).transform(
  File "/opt/homebrew/Caskroom/miniconda/base/envs/snowpark/lib/python3.10/site-packages/snowflake/ml/_internal/telemetry.py", line 302, in wrap
    res = func(*args, **kwargs)
  File "/opt/homebrew/Caskroom/miniconda/base/envs/snowpark/lib/python3.10/site-packages/snowflake/ml/_internal/telemetry.py", line 369, in wrap
    res = func(*args, **kwargs)
  File "/opt/homebrew/Caskroom/miniconda/base/envs/snowpark/lib/python3.10/site-packages/snowflake/ml/modeling/preprocessing/ordinal_encoder.py", line 419, in transform
    output_df = self._transform_snowpark(dataset)
  File "/opt/homebrew/Caskroom/miniconda/base/envs/snowpark/lib/python3.10/site-packages/snowflake/ml/modeling/preprocessing/ordinal_encoder.py", line 444, in _transform_snowpark
    if dataset._session._table_exists(self._vocab_table_name)
  File "/opt/homebrew/Caskroom/miniconda/base/envs/snowpark/lib/python3.10/site-packages/snowflake/snowpark/session.py", line 2101, in _table_exists
    raise SnowparkClientExceptionMessages.GENERAL_INVALID_OBJECT_NAME(
snowflake.snowpark.exceptions.SnowparkInvalidObjectNameException: (1500): The object name 's.n.o.w.m.l._.p.r.e.p.r.o.c.e.s.s.i.n.g._.o.r.d.i.n.a.l._.e.n.c.o.d.e.r._.t.e.m.p._.t.a.b.l.e._.c.4.9.9.c.5.f.6.c.1.6.5.4.e.a.4.9.d.e.e.4.0.1.6.5.b.1.5.8.a.9.5' is invalid.

After patching the decorator in snowflake/ml/_internal/telemetry.py to bypass telemetry:

def send_api_usage_telemetry(...):
    def decorator(func: Callable[_Args, _ReturnValue]) -> Callable[_Args, _ReturnValue]:
        return func()

    return decorator

I can see the actual error:

Traceback (most recent call last):
  File "/Users/juozas/projects/lt/spml/transformations.py", line 42, in <module>
    import snowflake.ml.modeling.preprocessing as snowml
  File "/opt/homebrew/Caskroom/miniconda/base/envs/snowpark/lib/python3.10/site-packages/snowflake/ml/modeling/preprocessing/__init__.py", line 7, in <module>
    exportable_classes = init_utils.fetch_classes_from_modules_in_pkg_dir(pkg_dir=pkg_dir, pkg_name=pkg_name)
  File "/opt/homebrew/Caskroom/miniconda/base/envs/snowpark/lib/python3.10/site-packages/snowflake/ml/_internal/init_utils.py", line 25, in fetch_classes_from_modules_in_pkg_dir
    module = importlib.import_module(f"{pkg_name}.{module_info.name}")
  File "/opt/homebrew/Caskroom/miniconda/base/envs/snowpark/lib/python3.10/importlib/__init__.py", line 126, in import_module
    return _bootstrap._gcd_import(name[level:], package, level)
  File "/opt/homebrew/Caskroom/miniconda/base/envs/snowpark/lib/python3.10/site-packages/snowflake/ml/modeling/preprocessing/binarizer.py", line 12, in <module>
    from snowflake.ml.modeling.framework import base
  File "/opt/homebrew/Caskroom/miniconda/base/envs/snowpark/lib/python3.10/site-packages/snowflake/ml/modeling/framework/base.py", line 291, in <module>
    class BaseEstimator(Base):
  File "/opt/homebrew/Caskroom/miniconda/base/envs/snowpark/lib/python3.10/site-packages/snowflake/ml/modeling/framework/base.py", line 373, in BaseEstimator
    def _compute(
  File "/opt/homebrew/Caskroom/miniconda/base/envs/snowpark/lib/python3.10/site-packages/snowflake/ml/_internal/telemetry.py", line 250, in decorator
    return func()
TypeError: BaseEstimator._compute() missing 4 required positional arguments: 'self', 'dataset', 'cols', and 'states'

I expect to see the errors raised by the public API code, without the telemetry code interfering with them.
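
The expectation above can be illustrated with a minimal sketch. This is not the snowflake-ml-python implementation; `with_safe_telemetry`, `_try_send`, and `_send_telemetry` are hypothetical names, and the telemetry client is made to always fail in order to simulate a broken telemetry path. The point is only that a failure while reporting is swallowed and the wrapped function's own exception is re-raised unchanged.

```python
import functools
from typing import Any, Callable, Dict


def _send_telemetry(payload: Dict[str, Any]) -> None:
    """Hypothetical telemetry client; always fails to simulate a broken path."""
    raise RuntimeError("telemetry backend unavailable")


def _try_send(payload: Dict[str, Any]) -> None:
    """Best-effort send: a telemetry failure must never reach the caller."""
    try:
        _send_telemetry(payload)
    except Exception:
        pass


def with_safe_telemetry(func: Callable[..., Any]) -> Callable[..., Any]:
    """Hypothetical decorator that reports usage without masking errors."""

    @functools.wraps(func)
    def wrap(*args: Any, **kwargs: Any) -> Any:
        try:
            res = func(*args, **kwargs)
        except Exception as exc:
            _try_send({"name": func.__name__, "error": type(exc).__name__})
            raise  # re-raise the original exception, traceback intact
        _try_send({"name": func.__name__, "error": None})
        return res

    return wrap
```

With this shape, a decorated function that raises `ValueError` still surfaces `ValueError` to the caller even though every telemetry call fails.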

Also, I would like to disable telemetry collection. The Snowpark Session option doesn't work, or at least setting it to False doesn't disable the failing code path.

sfc-gh-hayu commented 8 months ago

Disabling telemetry is supported as of 1.0.7. We are working on a clearer traceback.

sfc-gh-hayu commented 8 months ago

This issue will be fixed in 1.1.2. By the way, the initial error traceback looks correct: SnowparkInvalidObjectNameException is the actual cause. You got a different error after patching the decorator because your patch calls the function without passing any arguments: return func().
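
For anyone who wants to bypass the decorator while debugging, a pass-through version has to return the function object rather than call it. The following is a minimal sketch of that idea, not code from the library: the real `send_api_usage_telemetry` signature is elided in this thread, so the catch-all `*_args, **_kwargs` parameters, the `project="demo"` argument, and the `Example` class are assumptions for illustration only.

```python
from typing import Any, Callable, TypeVar

F = TypeVar("F", bound=Callable[..., Any])


def send_api_usage_telemetry(*_args: Any, **_kwargs: Any) -> Callable[[F], F]:
    """Hypothetical no-op stand-in: accepts and ignores any decorator arguments."""

    def decorator(func: F) -> F:
        # Return the function itself. Writing `return func()` would call it
        # at class-definition time with no arguments, which produces the
        # TypeError shown in the second traceback.
        return func

    return decorator


class Example:
    @send_api_usage_telemetry(project="demo")  # argument name is made up
    def compute(self, x: int) -> int:
        return x * 2
```

Here `Example().compute(3)` behaves exactly as it would without the decorator, so any exception raised inside the method propagates with its normal traceback.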