transferwise / pipelinewise-target-snowflake

Singer.io Target for Snowflake - PipelineWise compatible
https://transferwise.github.io/pipelinewise/

Breaking change with numpy 2.0.0 #430

Open isobel-taylor opened 3 months ago

isobel-taylor commented 3 months ago

**Describe the bug**
When numpy 2.0.0 was released, it included a breaking change that causes the loader to crash on startup. The error is ultimately raised while importing pandas.

NOTE: This error causes a complete failure.

**To Reproduce**

  1. Do a clean install using the plugin as the loader.
  2. Run as normal.

**Expected behavior**
The plugin should work.

**Screenshots**


```
2024-06-17T00:07:40.703817Z [info     ]   File "/app/src/projects/ea_meltano/mpa_export/.meltano/loaders/target-snowflake/venv/bin/target-snowflake", line 5, in <module> cmd_type=elb consumer=True name=target-snowflake producer=False stdio=stderr string_id=target-snowflake
2024-06-17T00:07:40.704162Z [info     ]     from target_snowflake import main cmd_type=elb consumer=True name=target-snowflake producer=False stdio=stderr string_id=target-snowflake
2024-06-17T00:07:40.704395Z [info     ]   File "/app/src/projects/ea_meltano/mpa_export/.meltano/loaders/target-snowflake/venv/lib/python3.11/site-packages/target_snowflake/__init__.py", line 18, in <module> cmd_type=elb consumer=True name=target-snowflake producer=False stdio=stderr string_id=target-snowflake
2024-06-17T00:07:40.704583Z [info     ]     from target_snowflake.file_formats import parquet cmd_type=elb consumer=True name=target-snowflake producer=False stdio=stderr string_id=target-snowflake
2024-06-17T00:07:40.704823Z [info     ]   File "/app/src/projects/ea_meltano/mpa_export/.meltano/loaders/target-snowflake/venv/lib/python3.11/site-packages/target_snowflake/file_formats/parquet.py", line 3, in <module> cmd_type=elb consumer=True name=target-snowflake producer=False stdio=stderr string_id=target-snowflake
2024-06-17T00:07:40.705053Z [info     ]     import pandas              cmd_type=elb consumer=True name=target-snowflake producer=False stdio=stderr string_id=target-snowflake
2024-06-17T00:07:40.709128Z [info     ]   File "/app/src/projects/ea_meltano/mpa_export/.meltano/loaders/target-snowflake/venv/lib/python3.11/site-packages/pandas/__init__.py", line 22, in <module> cmd_type=elb consumer=True name=target-snowflake producer=False stdio=stderr string_id=target-snowflake
2024-06-17T00:07:40.709460Z [info     ]     from pandas.compat import is_numpy_dev as _is_numpy_dev  # pyright: ignore # noqa:F401 cmd_type=elb consumer=True name=target-snowflake producer=False stdio=stderr string_id=target-snowflake
2024-06-17T00:07:40.709691Z [info     ]     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ cmd_type=elb consumer=True name=target-snowflake producer=False stdio=stderr string_id=target-snowflake
2024-06-17T00:07:40.709920Z [info     ]   File "/app/src/projects/ea_meltano/mpa_export/.meltano/loaders/target-snowflake/venv/lib/python3.11/site-packages/pandas/compat/__init__.py", line 25, in <module> cmd_type=elb consumer=True name=target-snowflake producer=False stdio=stderr string_id=target-snowflake
2024-06-17T00:07:40.710221Z [info     ]     from pandas.compat.numpy import ( cmd_type=elb consumer=True name=target-snowflake producer=False stdio=stderr string_id=target-snowflake
2024-06-17T00:07:40.710407Z [info     ]   File "/app/src/projects/ea_meltano/mpa_export/.meltano/loaders/target-snowflake/venv/lib/python3.11/site-packages/pandas/compat/numpy/__init__.py", line 4, in <module> cmd_type=elb consumer=True name=target-snowflake producer=False stdio=stderr string_id=target-snowflake
2024-06-17T00:07:40.710584Z [info     ]     from pandas.util.version import Version cmd_type=elb consumer=True name=target-snowflake producer=False stdio=stderr string_id=target-snowflake
2024-06-17T00:07:40.710760Z [info     ]   File "/app/src/projects/ea_meltano/mpa_export/.meltano/loaders/target-snowflake/venv/lib/python3.11/site-packages/pandas/util/__init__.py", line 2, in <module> cmd_type=elb consumer=True name=target-snowflake producer=False stdio=stderr string_id=target-snowflake
2024-06-17T00:07:40.710948Z [info     ]     from pandas.util._decorators import (  # noqa:F401 cmd_type=elb consumer=True name=target-snowflake producer=False stdio=stderr string_id=target-snowflake
2024-06-17T00:07:40.711135Z [info     ]   File "/app/src/projects/ea_meltano/mpa_export/.meltano/loaders/target-snowflake/venv/lib/python3.11/site-packages/pandas/util/_decorators.py", line 14, in <module> cmd_type=elb consumer=True name=target-snowflake producer=False stdio=stderr string_id=target-snowflake
2024-06-17T00:07:40.711330Z [info     ]     from pandas._libs.properties import cache_readonly cmd_type=elb consumer=True name=target-snowflake producer=False stdio=stderr string_id=target-snowflake
2024-06-17T00:07:40.711512Z [info     ]   File "/app/src/projects/ea_meltano/mpa_export/.meltano/loaders/target-snowflake/venv/lib/python3.11/site-packages/pandas/_libs/__init__.py", line 13, in <module> cmd_type=elb consumer=True name=target-snowflake producer=False stdio=stderr string_id=target-snowflake
2024-06-17T00:07:40.711709Z [info     ]     from pandas._libs.interval import Interval cmd_type=elb consumer=True name=target-snowflake producer=False stdio=stderr string_id=target-snowflake
2024-06-17T00:07:40.711941Z [info     ]   File "pandas/_libs/interval.pyx", line 1, in init pandas._libs.interval cmd_type=elb consumer=True name=target-snowflake producer=False stdio=stderr string_id=target-snowflake
2024-06-17T00:07:40.712244Z [info     ] ValueError: numpy.dtype size changed, may indicate binary incompatibility. Expected 96 from C header, got 88 from PyObject cmd_type=elb consumer=True name=target-snowflake producer=False stdio=stderr string_id=target-snowflake
```

**Your environment**
 - Version of target: 2.3.0
 - Version of python: 3.11

**Additional context**
Based on internet searches, I did try to force our outer venv to numpy 2.0.0, as some had issues with previous breaking releases that were resolved by ensuring numpy was running the same version everywhere. However, this didn't remedy the issue. This seems to be a true incompatibility.

We solved this internally by forcing numpy to 1.26.4.
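
For anyone who would rather fail fast than hit the traceback above, a guard along these lines might help. It is only a sketch: `numpy_needs_pin` and `major_version` are hypothetical helper names, not part of target-snowflake, and the rule "2.x or later means affected" is an assumption drawn from this report rather than a tested compatibility matrix.

```python
from typing import Optional


def major_version(version_string: str) -> int:
    """Return the leading integer of a version string such as '1.26.4'."""
    return int(version_string.split(".")[0])


def numpy_needs_pin(installed: Optional[str]) -> bool:
    """True when the given numpy version is 2.x or later.

    Assumption: every 2.x release is treated as affected, based only on
    the 2.0.0 failure described in this issue.
    """
    return installed is not None and major_version(installed) >= 2


if __name__ == "__main__":
    # The two versions mentioned in this issue.
    for candidate in ("2.0.0", "1.26.4"):
        print(candidate, "needs pin:", numpy_needs_pin(candidate))
```

The check takes a version string instead of importing numpy, so it can run before anything tries to import pandas.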

augusthorlen0 commented 3 months ago

I have experienced issues with this target previously, and most of the time the snowflake-connector-python[pandas] package causes the errors. I've forked the repo and upgraded that package to the latest version (3.10.1), and it works again.

https://github.com/augusthorlen0/pipelinewise-target-snowflake
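
To see which versions actually ended up in the loader's venv before and after switching to a fork, something like the following could be run with that venv's Python. This is a rough sketch, not part of the plugin: `installed_versions` is a hypothetical helper, and the distribution names in `PACKAGES` are my assumption of what pip calls them. It reads package metadata only, so it still works when `import pandas` crashes.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Dict, Iterable, Optional

# Assumed distribution names for the packages discussed in this thread.
PACKAGES = (
    "numpy",
    "pandas",
    "snowflake-connector-python",
    "pipelinewise-target-snowflake",
)


def installed_versions(names: Iterable[str]) -> Dict[str, Optional[str]]:
    """Map each distribution name to its installed version, or None if absent."""
    report: Dict[str, Optional[str]] = {}
    for name in names:
        try:
            # Metadata lookup only; the package itself is never imported.
            report[name] = version(name)
        except PackageNotFoundError:
            report[name] = None
    return report


if __name__ == "__main__":
    for name, found in installed_versions(PACKAGES).items():
        print(f"{name}: {found or 'not installed'}")
```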

sicarul commented 2 months ago

@augusthorlen0 you are a savior 🙌