snowflakedb / snowpark-python

Snowflake Snowpark Python API
Apache License 2.0

SNOW-1707286: [Local testing] `date_add` and `date_sub` functions fail for NULL values #2388

Open tvdboom opened 1 month ago

tvdboom commented 1 month ago

Please answer these questions before submitting your issue. Thanks!

  1. What version of Python are you using?

    Python 3.11.6 (tags/v3.11.6:8b6ee5b, Oct 2 2023, 14:57:12) [MSC v.1935 64 bit (AMD64)]

  2. What operating system and processor architecture are you using?

    Windows-10-10.0.22631-SP0

  3. What are the component versions in the environment (pip freeze)?

    pandas==2.2.2 snowflake-snowpark-python==1.22.1

  4. What did you do?

import pandas as pd
from snowflake.snowpark import Session
from datetime import date
from snowflake.snowpark.functions import col, date_add

mock_session = Session.builder.config("local_testing", True).create()
test_data = mock_session.create_dataframe(pd.DataFrame({"A": [date(2024, 1, 1), date(2024, 1, 2), None]}))

test_data.withColumn("b", date_add(col("a"), 2)).show()

  5. What did you expect to see?

    No error. Output of live session is:

---------------------------
|"A"         |"B"         |
---------------------------
|2024-01-01  |2024-01-03  |
|2024-01-02  |2024-01-04  |
|NULL        |NULL        |
---------------------------

Instead, I got:

Traceback (most recent call last):
  File "C:\repos\hippolib\venv\Lib\site-packages\snowflake\snowpark\mock\_plan.py", line 447, in handle_function_expression
    result = func(*to_pass_args, row_number=current_row, input_data=input_data)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\repos\hippolib\venv\Lib\site-packages\snowflake\snowpark\mock\_functions.py", line 117, in __call__
    result = self.impl(*args, **kwargs)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\repos\hippolib\venv\Lib\site-packages\snowflake\snowpark\mock\_functions.py", line 1567, in mock_dateadd
    res = datetime_expr.combine(
          ^^^^^^^^^^^^^^^^^^^^^^
  File "C:\repos\hippolib\venv\Lib\site-packages\pandas\core\series.py", line 3457, in combine
    new_values[i] = func(lv, rv)
                    ^^^^^^^^^^^^
  File "C:\repos\hippolib\venv\Lib\site-packages\snowflake\snowpark\mock\_functions.py", line 1568, in <lambda>
    value_expr, lambda date, duration: func(cast(date), duration)
                                       ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\repos\hippolib\venv\Lib\site-packages\snowflake\snowpark\mock\_functions.py", line 1526, in add_timedelta
    return date + datetime.timedelta(**{f"{unit}s": float(duration) * scalar})
           ~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
TypeError: unsupported operand type(s) for +: 'NoneType' and 'datetime.timedelta'

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "C:\repos\hippolib\venv\Lib\site-packages\IPython\core\interactiveshell.py", line 3577, in run_code
    exec(code_obj, self.user_global_ns, self.user_ns)
  File "<ipython-input-7-2e397755fd61>", line 1, in <module>
    runfile('C:\\repos\\hippolib\\test.py', wdir='C:\\repos\\hippolib')
  File "C:\Program Files\JetBrains\PyCharm 2023.3.5\plugins\python-ce\helpers\pydev\_pydev_bundle\pydev_umd.py", line 197, in runfile
    pydev_imports.execfile(filename, global_vars, local_vars)  # execute the script
    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Program Files\JetBrains\PyCharm 2023.3.5\plugins\python-ce\helpers\pydev\_pydev_imps\_pydev_execfile.py", line 18, in execfile
    exec(compile(contents+"\n", file, 'exec'), glob, loc)
  File "C:\repos\hippolib\test.py", line 16, in <module>
    test_data.withColumn("b", date_add(col("a"), 2)).show()
  File "C:\repos\hippolib\venv\Lib\site-packages\snowflake\snowpark\_internal\telemetry.py", line 156, in wrap
    result = func(*args, **kwargs)
             ^^^^^^^^^^^^^^^^^^^^^
  File "C:\repos\hippolib\venv\Lib\site-packages\snowflake\snowpark\dataframe.py", line 3203, in show
    self._show_string(
  File "C:\repos\hippolib\venv\Lib\site-packages\snowflake\snowpark\dataframe.py", line 3321, in _show_string
    result, meta = self._session._conn.get_result_and_metadata(
                   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\repos\hippolib\venv\Lib\site-packages\snowflake\snowpark\mock\_connection.py", line 673, in get_result_and_metadata
    res = execute_mock_plan(plan, plan.expr_to_alias)
          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\repos\hippolib\venv\Lib\site-packages\snowflake\snowpark\mock\_plan.py", line 623, in execute_mock_plan
    column_series = calculate_expression(
                    ^^^^^^^^^^^^^^^^^^^^^
  File "C:\repos\hippolib\venv\Lib\site-packages\snowflake\snowpark\mock\_plan.py", line 1633, in calculate_expression
    return handle_function_expression(exp, input_data, analyzer, expr_to_alias)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\repos\hippolib\venv\Lib\site-packages\snowflake\snowpark\mock\_plan.py", line 1635, in calculate_expression
    lhs = calculate_expression(exp.col, input_data, analyzer, expr_to_alias)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\repos\hippolib\venv\Lib\site-packages\snowflake\snowpark\mock\_plan.py", line 449, in handle_function_expression
    SnowparkLocalTestingException.raise_from_error(
  File "C:\repos\hippolib\venv\Lib\site-packages\snowflake\snowpark\mock\exceptions.py", line 20, in raise_from_error
    raise cls(
snowflake.snowpark.mock.exceptions.SnowparkLocalTestingException: Error executing mocked function 'dateadd'. See error traceback for detailed information.
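
The innermost frames show `Series.combine` calling a function that adds a `timedelta` to each value, and the `None` row raises the `TypeError`. The pandas-only sketch below reproduces that pattern outside Snowpark and shows one possible NULL guard. The helper names `add_days` and `add_days_null_safe` are hypothetical, written for illustration; this is not the library's code and not necessarily how the maintainers fixed it.

```python
import datetime

import pandas as pd


def add_days(date, duration):
    # Simplified stand-in for the failing frame: date + timedelta with no NULL check.
    return date + datetime.timedelta(days=float(duration))


def add_days_null_safe(date, duration):
    # Hypothetical guard: return NULL when either input is NULL,
    # which matches the live-session output shown earlier.
    if pd.isna(date) or pd.isna(duration):
        return None
    return add_days(date, duration)


dates = pd.Series(
    [datetime.date(2024, 1, 1), datetime.date(2024, 1, 2), None], dtype=object
)
days = pd.Series([2, 2, 2])

# Without the guard, the None row raises the same TypeError as in the traceback.
try:
    dates.combine(days, add_days)
    unguarded_error = None
except TypeError as exc:
    unguarded_error = exc

# With the guard, the None row stays NULL and the other rows are shifted by two days.
guarded = dates.combine(days, add_days_null_safe)
print(repr(unguarded_error))
print(guarded.tolist())
```

Guarding inside the per-row function keeps the `combine` call unchanged, so only the NULL handling differs between the two versions.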

sfc-gh-sghosh commented 1 week ago

Hello @tvdboom,

Thanks for raising the issue. We are able to reproduce it with local_testing, whereas it works fine with regular sessions. We will work on eliminating it.

Regards, Sujan

tvdboom commented 1 week ago

This was fixed in version 1.24.
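
For test suites that still run against older releases, one option is to gate NULL-handling tests on the installed version. The sketch below is a minimal illustration; `parse_version`, `FIXED_IN`, and `has_null_safe_dateadd` are hypothetical names, and the threshold simply encodes the "1.24" stated above, since the exact patch release is not given.

```python
def parse_version(version):
    # Keep only the leading numeric components, e.g. "1.22.1" -> (1, 22, 1).
    parts = []
    for piece in version.split("."):
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        if not digits:
            break
        parts.append(int(digits))
    return tuple(parts)


# Assumption: taken from the comment above; the patch release is not stated.
FIXED_IN = (1, 24)


def has_null_safe_dateadd(installed_version):
    # True when the given version string is at or above the reported fix version.
    return parse_version(installed_version) >= FIXED_IN


print(has_null_safe_dateadd("1.22.1"))
print(has_null_safe_dateadd("1.24.0"))
```

In a real suite the version string could come from `importlib.metadata.version("snowflake-snowpark-python")`, with the result used to skip the affected tests on older installs.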