Closed jakob-keller closed 1 year ago
I can reproduce this on my M1 Mac. I might have some time tomorrow to look into this. We could xfail
those tests for now for Mac.
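A minimal sketch of what such a conditional xfail could look like, assuming the failures are limited to macOS on Apple Silicon. The helper names `is_macos_arm64` and `xfail_on_macos_arm64` are hypothetical and not part of the Polars test suite; only `pytest.mark.xfail` is an existing pytest API.

```python
import platform
import sys


def is_macos_arm64(
    sys_platform: str = sys.platform, machine: str = platform.machine()
) -> bool:
    """Return True when running on macOS on an arm64 (Apple Silicon) machine."""
    return sys_platform == "darwin" and machine == "arm64"


def xfail_on_macos_arm64(reason: str):
    """Build an xfail marker that only applies on macOS arm64 (hypothetical helper)."""
    # Imported lazily so the platform check above stays usable without pytest.
    import pytest

    return pytest.mark.xfail(is_macos_arm64(), reason=reason)
```

A test would then be decorated with `@xfail_on_macos_arm64("misaligned pointer dereference in Decimal conversion")`. On other platforms the condition is false, so the marker has no effect and the test runs normally.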
We really need M1 GitHub runners... apparently, that's on their roadmap.
In the meantime I will see if I can fix the culprits.
Let me know if you need any additional context or want me to test something.
@jakob-keller could you do a run with RUST_BACKTRACE=1 and post the backtrace here?
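For reference, a rough sketch of one way to run a single test with the backtrace enabled from Python. The helper names `backtrace_env` and `run_single_test` are made up for illustration; setting the variable in the shell before calling pytest works just as well.

```python
import os
import subprocess
import sys


def backtrace_env(base=None):
    """Copy an environment mapping and enable Rust backtraces in the copy."""
    env = dict(os.environ if base is None else base)
    env["RUST_BACKTRACE"] = "1"
    return env


def run_single_test(test_name: str) -> int:
    """Run pytest for the tests matching `test_name` and return its exit code."""
    cmd = [sys.executable, "-m", "pytest", "-k", test_name]
    return subprocess.run(cmd, env=backtrace_env(), check=False).returncode
```

Calling `run_single_test("<name of a failing test>")` from the `py-polars` directory should print the Rust stack trace in the "Captured stderr call" section of the pytest report.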
Here's the full thing for one of the tests:
(.venv) stijn@Hephaestos:~/Documents/code/polars/py-polars$ pytest -k test_series_from_pydecimal_and_ints
========================================================================= test session starts =========================================================================
platform darwin -- Python 3.11.0, pytest-7.2.0, pluggy-1.0.0
rootdir: /Users/stijn/Documents/code/polars/py-polars, configfile: pyproject.toml
plugins: hypothesis-6.70.1, xdist-3.2.0, cov-4.0.0
collected 2133 items / 2132 deselected / 1 selected
tests/unit/datatypes/test_decimal.py F [100%]
============================================================================== FAILURES ===============================================================================
_________________________________________________________________ test_series_from_pydecimal_and_ints _________________________________________________________________
    def test_series_from_pydecimal_and_ints() -> None:
        # TODO: check what happens if there are strings, floats arrow scalars in the list
        for data in permutations_int_dec_none():
            s = pl.Series("name", data)
            assert s.dtype == pl.Decimal(None, 7)  # inferred scale = 7, precision = None
            assert s.name == "name"
            assert s.null_count() == 1
            for i, d in enumerate(data):
>               assert s[i] == d
tests/unit/datatypes/test_decimal.py:33:
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
self = shape: (5,)
Series: 'name' [decimal[7]]
[
-0.01
1.2345678
500
-1
null
], item = 0
    def __getitem__(
        self,
        item: (
            int | Series | range | slice | np.ndarray[Any, Any] | list[int] | list[bool]
        ),
    ) -> Any:
        if isinstance(item, Series) and item.dtype in {
            UInt8,
            UInt16,
            UInt32,
            UInt64,
            Int8,
            Int16,
            Int32,
            Int64,
        }:
            # Unsigned or signed Series (ordered from fastest to slowest).
            # - pl.UInt32 (polars) or pl.UInt64 (polars_u64_idx) Series indexes.
            # - Other unsigned Series indexes are converted to pl.UInt32 (polars)
            # or pl.UInt64 (polars_u64_idx).
            # - Signed Series indexes are converted pl.UInt32 (polars) or
            # pl.UInt64 (polars_u64_idx) after negative indexes are converted
            # to absolute indexes.
            return self._from_pyseries(
                self._s.take_with_series(self._pos_idxs(item)._s)
            )
        elif (
            _check_for_numpy(item)
            and isinstance(item, np.ndarray)
            and item.dtype.kind in ("i", "u")
        ):
            if item.ndim != 1:
                raise ValueError("Only a 1D-Numpy array is supported as index.")
            # Unsigned or signed Numpy array (ordered from fastest to slowest).
            # - np.uint32 (polars) or np.uint64 (polars_u64_idx) numpy array
            # indexes.
            # - Other unsigned numpy array indexes are converted to pl.UInt32
            # (polars) or pl.UInt64 (polars_u64_idx).
            # - Signed numpy array indexes are converted pl.UInt32 (polars) or
            # pl.UInt64 (polars_u64_idx) after negative indexes are converted
            # to absolute indexes.
            return self._from_pyseries(
                self._s.take_with_series(self._pos_idxs(item)._s)
            )
        # Integer.
        elif isinstance(item, int):
            if item < 0:
                item = self.len() + item
>           return self._s.get_idx(item)
E           pyo3_runtime.PanicException: misaligned pointer dereference: address must be a multiple of 0x10 but is 0x16ce57b38
polars/series/series.py:850: PanicException
------------------------------------------------------------------------ Captured stderr call -------------------------------------------------------------------------
thread '<unnamed>' panicked at 'misaligned pointer dereference: address must be a multiple of 0x10 but is 0x16ce57b38', src/conversion.rs:191:9
stack backtrace:
0: rust_begin_unwind
at /rustc/0599b6b931816ab46ab79072189075f543931cbd/library/std/src/panicking.rs:577:5
1: core::panicking::panic_fmt
at /rustc/0599b6b931816ab46ab79072189075f543931cbd/library/core/src/panicking.rs:67:14
2: core::panicking::panic_misaligned_pointer_dereference
at /rustc/0599b6b931816ab46ab79072189075f543931cbd/library/core/src/panicking.rs:174:5
3: polars::conversion::decimal_to_digits
at ./src/conversion.rs:191:9
4: <polars::conversion::Wrap<polars_core::datatypes::any_value::AnyValue> as pyo3::conversion::IntoPy<pyo3::instance::Py<pyo3::types::any::PyAny>>>::into_py
at ./src/conversion.rs:266:32
5: polars::series::PySeries::get_idx
at ./src/series.rs:430:16
6: polars::series::_::<impl polars::series::PySeries>::__pymethod_get_idx__
at ./src/series.rs:221:1
7: pyo3::impl_::trampoline::cfunction_with_keywords::{{closure}}
at /Users/stijn/.cargo/registry/src/index.crates.io-6f17d22bba15001f/pyo3-0.18.2/src/impl_/trampoline.rs:41:35
8: pyo3::impl_::trampoline::trampoline_inner::{{closure}}
at /Users/stijn/.cargo/registry/src/index.crates.io-6f17d22bba15001f/pyo3-0.18.2/src/impl_/trampoline.rs:204:54
9: std::panicking::try::do_call
at /rustc/0599b6b931816ab46ab79072189075f543931cbd/library/std/src/panicking.rs:485:40
10: ___rust_try
11: std::panicking::try
at /rustc/0599b6b931816ab46ab79072189075f543931cbd/library/std/src/panicking.rs:449:19
12: std::panic::catch_unwind
at /rustc/0599b6b931816ab46ab79072189075f543931cbd/library/std/src/panic.rs:140:14
13: pyo3::impl_::trampoline::trampoline_inner
at /Users/stijn/.cargo/registry/src/index.crates.io-6f17d22bba15001f/pyo3-0.18.2/src/impl_/trampoline.rs:204:9
14: pyo3::impl_::trampoline::cfunction_with_keywords
at /Users/stijn/.cargo/registry/src/index.crates.io-6f17d22bba15001f/pyo3-0.18.2/src/impl_/trampoline.rs:41:13
15: polars::series::_::_::__init::__INVENTORY::trampoline
at ./src/series.rs:221:1
16: _method_vectorcall_VARARGS_KEYWORDS
17: _PyObject_Vectorcall
18: __PyEval_EvalFrameDefault
19: __PyEval_Vector
20: _slot_mp_subscript
21: __PyEval_EvalFrameDefault
22: __PyEval_Vector
23: __PyEval_EvalFrameDefault
24: __PyEval_Vector
25: __PyEval_EvalFrameDefault
26: __PyEval_Vector
27: __PyObject_FastCallDictTstate
28: __PyObject_Call_Prepend
29: _slot_tp_call
30: __PyObject_MakeTpCall
31: __PyEval_EvalFrameDefault
32: __PyEval_Vector
33: __PyEval_EvalFrameDefault
34: __PyEval_Vector
35: __PyObject_FastCallDictTstate
36: __PyObject_Call_Prepend
37: _slot_tp_call
38: __PyObject_Call
39: __PyEval_EvalFrameDefault
40: __PyEval_Vector
41: __PyEval_EvalFrameDefault
42: __PyEval_Vector
43: __PyEval_EvalFrameDefault
44: __PyEval_Vector
45: __PyObject_FastCallDictTstate
46: __PyObject_Call_Prepend
47: _slot_tp_call
48: __PyObject_MakeTpCall
49: __PyEval_EvalFrameDefault
50: __PyEval_Vector
51: __PyEval_EvalFrameDefault
52: __PyEval_Vector
53: __PyObject_FastCallDictTstate
54: __PyObject_Call_Prepend
55: _slot_tp_call
56: __PyObject_MakeTpCall
57: __PyEval_EvalFrameDefault
58: __PyEval_Vector
59: __PyEval_EvalFrameDefault
60: __PyEval_Vector
61: __PyObject_FastCallDictTstate
62: __PyObject_Call_Prepend
63: _slot_tp_call
64: __PyObject_MakeTpCall
65: __PyEval_EvalFrameDefault
66: _PyEval_EvalCode
67: __PyRun_SimpleFileObject
68: __PyRun_AnyFileObject
69: _Py_RunMain
70: _pymain_main
71: _Py_BytesMain
note: Some details are omitted, run with `RUST_BACKTRACE=full` for a verbose backtrace.
======================================================================= short test summary info =======================================================================
FAILED tests/unit/datatypes/test_decimal.py::test_series_from_pydecimal_and_ints - pyo3_runtime.PanicException: misaligned pointer dereference: address must be a multiple of 0x10 but is 0x16ce57b38
Thanks @stinodego. Can work with that. Is it only related to decimals?
The four tests that fail are indeed all related to Decimals somehow.
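To make the panic message concrete, here is a small worked check of the two addresses reported in this thread. The 16-byte requirement would be consistent with a 128-bit integer being read through a pointer, but that is an assumption based only on the message, not on the code at `src/conversion.rs:191`. The helper `misalignment` is illustrative.

```python
def misalignment(address: int, alignment: int) -> int:
    """Return how many bytes `address` lies past the previous `alignment` boundary."""
    if alignment <= 0 or alignment & (alignment - 1):
        raise ValueError("alignment must be a positive power of two")
    # For a power-of-two alignment, the low bits are the remainder.
    return address & (alignment - 1)


# Addresses taken from the two panic messages in this thread.
for address in (0x16CE57B38, 0x16CF18D38):
    print(hex(address), misalignment(address, 0x10), misalignment(address, 0x8))
```

Both addresses end in `0x38`, so each sits 8 bytes past a 16-byte boundary: they are 8-byte aligned but not 16-byte aligned, which is exactly what the check rejects.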
Polars version checks
[X] I have checked that this issue has not already been reported.
[X] I have confirmed this bug exists on the latest version of Polars.
Issue description
ef2ab8af98eb154dc2f924105d787ad0328a5944 introduced changes that cause my local development environment to fail the test suite with 4 errors.
I believe I set up the environment as described in CONTRIBUTING.md. It looks like this:
macOS Ventura 13.3 on Apple M2 Pro
CPython 3.10.10
Reproducible example
Output of `make test`
```
py-polars % make test
🍹 Building a mixed python/rust project
🔗 Found pyo3 bindings with abi3 support for Python ≥ 3.7
🐍 Not using a specific python interpreter
Ignoring backports.zoneinfo: markers 'python_version < "3.9" and extra == "timezone"' don't match your environment
Ignoring tzdata: markers 'platform_system == "Windows" and extra == "timezone"' don't match your environment
Ignoring connectorx: markers 'extra == "connectorx"' don't match your environment
Ignoring fsspec: markers 'extra == "fsspec"' don't match your environment
Ignoring numpy: markers 'extra == "numpy"' don't match your environment
Ignoring xlsx2csv: markers 'extra == "xlsx2csv"' don't match your environment
Ignoring xlsxwriter: markers 'extra == "xlsxwriter"' don't match your environment
Ignoring pyarrow: markers 'extra == "pandas"' don't match your environment
Ignoring pandas: markers 'extra == "pandas"' don't match your environment
Ignoring deltalake: markers 'extra == "deltalake"' don't match your environment
Ignoring sqlalchemy: markers 'extra == "sqlalchemy"' don't match your environment
Ignoring pandas: markers 'extra == "sqlalchemy"' don't match your environment
Ignoring polars: markers 'extra == "all"' don't match your environment
Ignoring pyarrow: markers 'extra == "pyarrow"' don't match your environment
Ignoring matplotlib: markers 'extra == "matplotlib"' don't match your environment
Requirement already satisfied: typing_extensions>=4.0.1 in ./.venv/lib/python3.10/site-packages (4.5.0)
💻 Using `MACOSX_DEPLOYMENT_TARGET=11.0` for aarch64-apple-darwin by default
    Finished dev [unoptimized + debuginfo] target(s) in 0.40s
📦 Built wheel for abi3 Python ≥ 3.7 to /var/folders/7b/kqrfbrqj563g93dc65hn31kc0000gn/T/.tmpu1sXFK/polars-0.16.17-cp37-abi3-macosx_11_0_arm64.whl
🛠 Installed polars-0.16.17
.venv/bin/pytest -n auto --dist worksteal
===============================================================================================================================================
test session starts =============================================================================================================================================== platform darwin -- Python 3.10.10, pytest-7.2.0, pluggy-1.0.0 rootdir: /Users/xxx/PycharmProjects/polars/py-polars, configfile: pyproject.toml plugins: hypothesis-6.70.1, xdist-3.2.0, cov-4.0.0 gw0 [2042] / gw1 [2042] / gw2 [2042] / gw3 [2042] / gw4 [2042] / gw5 [2042] / gw6 [2042] / gw7 [2042] / gw8 [2042] / gw9 [2042] ........................................................................................................................................................................................................................................................................................................... [ 14%] ...................................................................................................................F....................................................................................................................................................................................... [ 29%] ........................................................................................................................................................................................................................................................................................................... [ 43%] .............................F......................................................s...................................................................................................................................................................................................................... 
[ 58%] ............................................................................................................................................................................................................................................................................................................ [ 73%] ..................................................................................................................................................................F............................................................................................................F........................... [ 87%] ....................................................................................................................................................................................................................................................... [100%] ==================================================================================================================================================== FAILURES ===================================================================================================================================================== ______________________________________________________________________________________________________________________________________ test_init_dataclasses_and_namedtuple _______________________________________________________________________________________________________________________________________ [gw0] darwin -- Python 3.10.10 /Users/xxx/PycharmProjects/polars/py-polars/.venv/bin/python monkeypatch = <_pytest.monkeypatch.MonkeyPatch object at 0x132503d00> def test_init_dataclasses_and_namedtuple(monkeypatch: Any) -> None: from dataclasses import dataclass from typing import NamedTuple monkeypatch.setenv("POLARS_ACTIVATE_DECIMAL", "1") from polars.utils._construction import dataclass_type_hints @dataclass class TradeDC: timestamp: datetime ticker: str price: Decimal size: int | None = None 
        class TradeNT(NamedTuple):
            timestamp: datetime
            ticker: str
            price: Decimal
            size: int | None = None

        raw_data = [
            (datetime(2022, 9, 8, 14, 30, 45), "AAPL", Decimal("157.5"), 125),
            (datetime(2022, 9, 9, 10, 15, 12), "FLSY", Decimal("10.0"), 1500),
            (datetime(2022, 9, 7, 15, 30), "MU", Decimal("55.5"), 400),
        ]

        for TradeClass in (TradeDC, TradeNT):
            trades = [TradeClass(*values) for values in raw_data]

            for DF in (pl.DataFrame, pl.from_records):
                df = DF(data=trades)  # type: ignore[operator]
                assert df.schema == {
                    "timestamp": pl.Datetime("us"),
                    "ticker": pl.Utf8,
                    "price": pl.Decimal(None, 1),
                    "size": pl.Int64,
                }
>               assert df.rows() == raw_data

tests/unit/test_constructors.py:154:
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
polars/utils/decorators.py:136: in wrapper
    return function(*args, **kwargs)
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _

self = shape: (3, 4)
┌─────────────────────┬────────┬────────────┬──────┐
│ timestamp           ┆ ticker ┆ price      ┆ size ... ┆ 1500 │
│ 2022-09-07 15:30:00 ┆ MU     ┆ 55.5       ┆ 400  │
└─────────────────────┴────────┴────────────┴──────┘, named = False

    @deprecate_nonkeyword_arguments()
    def rows(self, named: bool = False) -> list[tuple[Any, ...]] | list[dict[str, Any]]:
        """
        Returns all data in the DataFrame as a list of rows of python-native values.

        Parameters
        ----------
        named
            Return dictionaries instead of tuples. The dictionaries are a mapping of
            column name to row value. This is more expensive than returning a regular
            tuple, but allows for accessing values by column name.

        Notes
        -----
        If you have ``ns``-precision temporal values you should be aware that python
        natively only supports up to ``us``-precision; if this matters you should
        export to a different format.

        Warnings
        --------
        Row-iteration is not optimal as the underlying data is stored in columnar form;
        where possible, prefer export via one of the dedicated export/output methods.

        Returns
        -------
        A list of tuples (default) or dictionaries of row values.

        Examples
        --------
        >>> df = pl.DataFrame(
        ...     {
        ...         "a": [1, 3, 5],
        ...         "b": [2, 4, 6],
        ...     }
        ... )
        >>> df.rows()
        [(1, 2), (3, 4), (5, 6)]
        >>> df.rows(named=True)
        [{'a': 1, 'b': 2}, {'a': 3, 'b': 4}, {'a': 5, 'b': 6}]

        See Also
        --------
        iter_rows : Row iterator over frame data (does not materialise all rows).

        """
        if named:
            # Load these into the local namespace for a minor performance boost
            dict_, zip_, columns = dict, zip, self.columns
            return [dict_(zip_(columns, row)) for row in self._df.row_tuples()]
        else:
>           return self._df.row_tuples()
E           pyo3_runtime.PanicException: misaligned pointer dereference: address must be a multiple of 0x10 but is 0x16cf18d38

polars/dataframe/frame.py:7839: PanicException
------------------------------------------------------------------------ Captured stderr call -------------------------------------------------------------------------
thread '
```
Expected behavior
`make test` succeeds
Installed versions