Open pkit opened 2 months ago
@pkit Can you please share the package version(s) for which this issue occurred, and any other configuration you may have passed to the driver/connection?
I failed to reproduce this using adbc-driver-snowflake = 1.1.0. For me it failed with the stack trace I would have expected:
panic: arrow/array: number of columns/fields mismatch
goroutine 38 [running]:
github.com/apache/arrow/go/v17/arrow/array.NewRecord(0x1400018e480, {0x14000e82010, 0x1, 0x160009760?}, 0x2)
/Users/runner/go/pkg/mod/github.com/apache/arrow/go/v17@v17.0.0-20240626234237-6680dcfbef42/arrow/array/record.go:151 +0x198
github.com/apache/arrow/go/v17/arrow/cdata.ImportCRecordBatchWithSchema(0x14000581f80?, 0x1400018e480)
/Users/runner/go/pkg/mod/github.com/apache/arrow/go/v17@v17.0.0-20240626234237-6680dcfbef42/arrow/cdata/interface.go:131 +0x248
github.com/apache/arrow/go/v17/arrow/cdata.(*nativeCRecordBatchReader).next(0x14000a9a340)
/Users/runner/go/pkg/mod/github.com/apache/arrow/go/v17@v17.0.0-20240626234237-6680dcfbef42/arrow/cdata/cdata.go:997 +0x1bc
github.com/apache/arrow/go/v17/arrow/cdata.(*nativeCRecordBatchReader).Next(0x14000a9a340)
/Users/runner/go/pkg/mod/github.com/apache/arrow/go/v17@v17.0.0-20240626234237-6680dcfbef42/arrow/cdata/cdata.go:956 +0x20
github.com/apache/arrow-adbc/go/adbc/driver/snowflake.readRecords({0x161bd3558, 0x140005d0aa0}, {0x108fb3898, 0x14000a9a340}, 0x14000118840)
/Users/runner/work/arrow-adbc/arrow-adbc/adbc/go/adbc/driver/snowflake/bulk_ingestion.go:315 +0x78
github.com/apache/arrow-adbc/go/adbc/driver/snowflake.(*statement).ingestStream.func3()
/Users/runner/work/arrow-adbc/arrow-adbc/adbc/go/adbc/driver/snowflake/bulk_ingestion.go:249 +0x34
golang.org/x/sync/errgroup.(*Group).Go.func1()
/Users/runner/go/pkg/mod/golang.org/x/sync@v0.7.0/errgroup/errgroup.go:78 +0x58
created by golang.org/x/sync/errgroup.(*Group).Go in goroutine 17
/Users/runner/go/pkg/mod/golang.org/x/sync@v0.7.0/errgroup/errgroup.go:75 +0x98
Abort trap: 6
$ pip freeze | grep adbc
adbc-driver-manager==1.1.0
adbc-driver-snowflake==1.1.0
I will add a full repro soon. Yes, it involves custom configuration for the adbc.snowflake.statement.ingest_* options.
I lied, it fails even with no custom config.
snowflake_connector_profile.url is just a Snowflake URL as a string.
pytest:
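# Imports are assumed; the original report did not show them.
import pyarrow as pa
from adbc_driver_snowflake.dbapi import connect


def test_adbc_bug(snowflake_connector_profile):
    c = connect(snowflake_connector_profile.url, db_kwargs={
        "adbc.snowflake.sql.schema": "PUBLIC",
        "adbc.snowflake.sql.db": "TEST1",
    })
    # The declared schema has two columns...
    schema = pa.schema(
        fields=[
            pa.field("name1", pa.string()),
            pa.field("name2", pa.string()),
        ]
    )
    # ...but the rows only carry name1, so the inferred batch has one column.
    data = [
        {"name1": "aaa"},
        {"name1": "bbb"},
    ]
    reader = pa.RecordBatchReader.from_batches(schema, [pa.RecordBatch.from_pylist(data)])
    with c.cursor() as cur:
        cur.adbc_ingest("test2", reader, mode="create_append")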
Exception:
=================================================================================================== test session starts ===================================================================================================
platform linux -- Python 3.11.9, pytest-8.3.2, pluggy-1.5.0 -- /home/user/adbc_bug/.venv/bin/python
cachedir: .pytest_cache
rootdir: /home/user/adbc_bug
configfile: pyproject.toml
plugins: asyncio-0.24.0, anyio-3.7.1, Faker-28.1.0
asyncio: mode=Mode.STRICT, default_loop_scope=None
collected 1 item
tests/functional/python/test_sf_transform.py::test_adbc_bug Fatal Python error: Aborted
Thread 0x00007f3febe29740 (most recent call first):
File "/home/user/adbc_bug/.venv/lib/python3.11/site-packages/adbc_driver_manager/dbapi.py", line 937 in adbc_ingest
File "/home/user/adbc_bug/tests/functional/python/test_sf_transform.py", line 148 in test_adbc_bug
File "/home/user/adbc_bug/.venv/lib/python3.11/site-packages/_pytest/python.py", line 159 in pytest_pyfunc_call
File "/home/user/adbc_bug/.venv/lib/python3.11/site-packages/pluggy/_callers.py", line 103 in _multicall
File "/home/user/adbc_bug/.venv/lib/python3.11/site-packages/pluggy/_manager.py", line 120 in _hookexec
File "/home/user/adbc_bug/.venv/lib/python3.11/site-packages/pluggy/_hooks.py", line 513 in __call__
File "/home/user/adbc_bug/.venv/lib/python3.11/site-packages/_pytest/python.py", line 1627 in runtest
File "/home/user/adbc_bug/.venv/lib/python3.11/site-packages/_pytest/runner.py", line 174 in pytest_runtest_call
File "/home/user/adbc_bug/.venv/lib/python3.11/site-packages/pluggy/_callers.py", line 103 in _multicall
File "/home/user/adbc_bug/.venv/lib/python3.11/site-packages/pluggy/_manager.py", line 120 in _hookexec
File "/home/user/adbc_bug/.venv/lib/python3.11/site-packages/pluggy/_hooks.py", line 513 in __call__
File "/home/user/adbc_bug/.venv/lib/python3.11/site-packages/_pytest/runner.py", line 242 in <lambda>
File "/home/user/adbc_bug/.venv/lib/python3.11/site-packages/_pytest/runner.py", line 341 in from_call
File "/home/user/adbc_bug/.venv/lib/python3.11/site-packages/_pytest/runner.py", line 241 in call_and_report
File "/home/user/adbc_bug/.venv/lib/python3.11/site-packages/_pytest/runner.py", line 132 in runtestprotocol
File "/home/user/adbc_bug/.venv/lib/python3.11/site-packages/_pytest/runner.py", line 113 in pytest_runtest_protocol
File "/home/user/adbc_bug/.venv/lib/python3.11/site-packages/pluggy/_callers.py", line 103 in _multicall
File "/home/user/adbc_bug/.venv/lib/python3.11/site-packages/pluggy/_manager.py", line 120 in _hookexec
File "/home/user/adbc_bug/.venv/lib/python3.11/site-packages/pluggy/_hooks.py", line 513 in __call__
File "/home/user/adbc_bug/.venv/lib/python3.11/site-packages/_pytest/main.py", line 362 in pytest_runtestloop
File "/home/user/adbc_bug/.venv/lib/python3.11/site-packages/pluggy/_callers.py", line 103 in _multicall
File "/home/user/adbc_bug/.venv/lib/python3.11/site-packages/pluggy/_manager.py", line 120 in _hookexec
File "/home/user/adbc_bug/.venv/lib/python3.11/site-packages/pluggy/_hooks.py", line 513 in __call__
File "/home/user/adbc_bug/.venv/lib/python3.11/site-packages/_pytest/main.py", line 337 in _main
File "/home/user/adbc_bug/.venv/lib/python3.11/site-packages/_pytest/main.py", line 283 in wrap_session
File "/home/user/adbc_bug/.venv/lib/python3.11/site-packages/_pytest/main.py", line 330 in pytest_cmdline_main
File "/home/user/adbc_bug/.venv/lib/python3.11/site-packages/pluggy/_callers.py", line 103 in _multicall
File "/home/user/adbc_bug/.venv/lib/python3.11/site-packages/pluggy/_manager.py", line 120 in _hookexec
File "/home/user/adbc_bug/.venv/lib/python3.11/site-packages/pluggy/_hooks.py", line 513 in __call__
File "/home/user/adbc_bug/.venv/lib/python3.11/site-packages/_pytest/config/__init__.py", line 175 in main
File "/home/user/adbc_bug/.venv/lib/python3.11/site-packages/_pytest/config/__init__.py", line 201 in console_main
File "/home/user/adbc_bug/.venv/lib/python3.11/site-packages/pytest/__main__.py", line 9 in <module>
File "<frozen runpy>", line 88 in _run_code
File "<frozen runpy>", line 198 in _run_module_as_main
Extension modules: numpy._core._multiarray_umath, numpy.linalg._umath_linalg, pyarrow.lib, adbc_driver_manager._lib, pyarrow._compute, pyarrow._acero, pyarrow._fs, pyarrow._csv, pyarrow._json, pyarrow._dataset, pyarrow._dataset_orc, pyarrow._parquet, pyarrow._parquet_encryption, pyarrow._dataset_parquet_encryption, pyarrow._dataset_parquet, adbc_driver_manager._reader, pydantic.typing, pydantic.errors, pydantic.version, pydantic.utils, pydantic.class_validators, pydantic.config, pydantic.color, pydantic.datetime_parse, pydantic.validators, pydantic.networks, pydantic.types, pydantic.json, pydantic.error_wrappers, pydantic.fields, pydantic.parse, pydantic.schema, pydantic.main, pydantic.dataclasses, pydantic.annotated_types, pydantic.decorator, pydantic.env_settings, pydantic.tools, pydantic, clickhouse_connect.driverc.buffer, clickhouse_connect.driverc.dataconv, clickhouse_connect.driverc.npconv, zstandard.backend_c, lz4._version, lz4.frame._frame, pyarrow._azurefs, pyarrow._hdfs, pyarrow._gcsfs, pyarrow._s3fs, psycopg2._psycopg, regex._regex, _cffi_backend, charset_normalizer.md, snowflake.connector.nanoarrow_arrow_iterator (total: 54)
Aborted (core dumped)
What happened?
If the schema of a RecordBatchReader doesn't match the actual batch columns, the ADBC driver crashes. It's also pretty hard to debug why, because the only lead is "double free or corruption (out)". I needed to run it under valgrind to understand what's going on. For some reason it fails in Go with a proper exception, "index out of bounds", but then that is not propagated to the Python code.
Stack Trace
No response
How can we reproduce the bug?
Environment/Setup
Latest