snowflakedb / snowflake-connector-python

Snowflake Connector for Python
https://pypi.python.org/pypi/snowflake-connector-python/
Apache License 2.0
578 stars 467 forks source link

SNOW-751918: Zero year timestamp can't be converted to datetime although being valid in Snowflake #1463

Closed vvkh closed 1 year ago

vvkh commented 1 year ago

Please answer these questions before submitting your issue. Thanks!

  1. What version of Python are you using?

    Python 3.10.10 (main, Feb 8 2023, 05:34:50) [Clang 14.0.0 (clang-1400.0.29.202)]

  2. What operating system and processor architecture are you using?

    macOS-13.1-arm64-arm-64bit

  3. What are the component versions in the environment (pip freeze)?

asn1crypto==1.5.1
certifi==2022.12.7
cffi==1.15.1
charset-normalizer==2.1.1
cryptography==39.0.1
filelock==3.9.0
idna==3.4
numpy==1.24.2
oscrypto==1.3.0
pandas==1.5.3
pyarrow==10.0.1
pycparser==2.21
pycryptodomex==3.17
PyJWT==2.6.0
pyOpenSSL==23.0.0
python-dateutil==2.8.2
pytz==2022.7.1
requests==2.28.2
six==1.16.0
snowflake-connector-python==3.0.0
typing_extensions==4.5.0
urllib3==1.26.14
  1. What did you do?
import snowflake.connector

conn = snowflake.connector.connect(password, **kwargs)
conn.cursor().execute("""
    create table dev.valentin.timestamp_issue (
        ts_field timestamp_ntz
    )
""")
conn.cursor().execute("""
    insert into dev.valentin.timestamp_issue values (
      '0000-01-01 00:00:00.000'
    )
""")
result = conn.cursor().execute("select * from dev.valentin.timestamp_issue")
print(list(result))
  1. What did you expect to see?

    1 row fetched from the database.

  2. Can you set logging to DEBUG and collect the logs?

    2023-03-01 22:46:35,498 - MainThread result_batch.py:495 - _parse() - ERROR - Failed to convert: field TS_FIELD: TIMESTAMP_NTZ::-62167219200.000000000, Error: year 0 is out of range
    Traceback (most recent call last):
    File "/Users/vvkh/Projects/test_snowflake/.venv/lib/python3.10/site-packages/snowflake/connector/result_batch.py", line 490, in _parse
    row_result[idx] = v if c is None or v is None else c(v)
    File "/Users/vvkh/Projects/test_snowflake/.venv/lib/python3.10/site-packages/snowflake/connector/converter.py", line 300, in conv
    return datetime.utcfromtimestamp(microseconds)
    ValueError: year 0 is out of range
vvkh commented 1 year ago

At this point, there is no reliable workaround apart from converting the timestamp to some other type on the database side, which is not convenient if you have many existing queries. Although it's hard to imagine a generic solution that will not cause some other nasty bugs, It would be nice to be able to configure the behaviour for such timestamps. For example, configure treating timestamps as floats or something like that.

vvkh commented 1 year ago

I believe https://github.com/snowflakedb/snowflake-connector-python/issues/3 is somewhat related.

vvkh commented 1 year ago

I wonder if the way to handle this by design is to provide a custom converter_class. Is this part of a stable API that one can rely on?

https://github.com/snowflakedb/snowflake-connector-python/blob/ec95c563ded4694f69e8bde4eb2f010f92681e58/src/snowflake/connector/connection.py#L259

pik94 commented 1 year ago

I see that converter_class methods might be called only for json-type response: https://github.com/snowflakedb/snowflake-connector-python/blob/ec95c563ded4694f69e8bde4eb2f010f92681e58/src/snowflake/connector/result_batch.py#L86 Otherwise, methods from ArrowContext will be used to parse result data.

Do i understand correctly that ArrowContext is always used if a flag CAN_USE_ARROW_RESULT_FORMAT set to True https://github.com/snowflakedb/snowflake-connector-python/blob/ec95c563ded4694f69e8bde4eb2f010f92681e58/src/snowflake/connector/cursor.py#L90-L92

and there is no way to use a custom converter class in that case?

sfc-gh-aalam commented 1 year ago

Hey @vvkh, have you tried writing your own converter class? I think you would have to inherit SnowflakeConverter and have a custom implementation for _TIME_to_python.

vvkh commented 1 year ago

Hey @vvkh, have you tried writing your own converter class? I think you would have to inherit SnowflakeConverter and have a custom implementation for _TIME_to_python.

@sfc-gh-aalam yep, working on it right now. Do you think there must be something that would let you handle such timestamps out of the box, though? Like a parameter for the connector class or a built-in converter.

sfc-gh-aalam commented 1 year ago

based on dicussion in https://github.com/snowflakedb/snowflake-connector-python/issues/3, we will create add an option to return None and log warning when there is a conversion datetime util cannot handle.