vaexio / vaex

Out-of-Core hybrid Apache Arrow/NumPy DataFrame for Python, ML, visualization and exploration of big tabular data at a billion rows per second 🚀
https://vaex.io
MIT License

[BUG-REPORT] vaex causes a segmentation fault on windows #2442

Open iisakkirotko opened 2 weeks ago

iisakkirotko commented 2 weeks ago

Description We're seeing a Windows fatal exception: access violation when running vaex-core 4.18.1 on Windows with Python 3.9; we're not sure whether the issue also affects other Python versions. I mentioned this in https://github.com/vaexio/vaex/issues/2439#issuecomment-2387812418, believing it to be related, but the issue persists with vaex-core 4.18.1, as can be seen in this CI run.

Software information

Additional information The full stack trace is

Stack Trace
```bash
Thread 0x00001d58 (most recent call first):
  File "C:\hostedtoolcache\windows\Python\3.9.13\x64\lib\site-packages\vaex\hash.py", line 171 in add
  File "C:\hostedtoolcache\windows\Python\3.9.13\x64\lib\site-packages\vaex\cpu.py", line 344 in process
  File "C:\hostedtoolcache\windows\Python\3.9.13\x64\lib\site-packages\vaex\execution.py", line 564 in process_tasks
  File "C:\hostedtoolcache\windows\Python\3.9.13\x64\lib\site-packages\vaex\execution.py", line 500 in process_part
  File "C:\hostedtoolcache\windows\Python\3.9.13\x64\lib\site-packages\vaex\multithreading.py", line 80 in wrapped
  File "C:\hostedtoolcache\windows\Python\3.9.13\x64\lib\concurrent\futures\thread.py", line 58 in run
  File "C:\hostedtoolcache\windows\Python\3.9.13\x64\lib\concurrent\futures\thread.py", line 83 in _worker
  File "C:\hostedtoolcache\windows\Python\3.9.13\x64\lib\threading.py", line 917 in run
  File "C:\hostedtoolcache\windows\Python\3.9.13\x64\lib\threading.py", line 980 in _bootstrap_inner
  File "C:\hostedtoolcache\windows\Python\3.9.13\x64\lib\threading.py", line 937 in _bootstrap
  File "C:\hostedtoolcache\windows\Python\3.9.13\x64\lib\site-packages\solara\server\patch.py", line 306 in _WidgetContextAwareThread__bootstrap
  File "C:\hostedtoolcache\windows\Python\3.9.13\x64\lib\site-packages\solara\server\patch.py", line 284 in WidgetContextAwareThread__bootstrap

Current thread 0x00001cf4 (most recent call first):
  File "C:\hostedtoolcache\windows\Python\3.9.13\x64\lib\site-packages\vaex\hash.py", line 171 in add
  File "C:\hostedtoolcache\windows\Python\3.9.13\x64\lib\site-packages\vaex\cpu.py", line 344 in process
  File "C:\hostedtoolcache\windows\Python\3.9.13\x64\lib\site-packages\vaex\execution.py", line 564 in process_tasks
  File "C:\hostedtoolcache\windows\Python\3.9.13\x64\lib\site-packages\vaex\execution.py", line 500 in process_part
  File "C:\hostedtoolcache\windows\Python\3.9.13\x64\lib\site-packages\vaex\multithreading.py", line 80 in wrapped
  File "C:\hostedtoolcache\windows\Python\3.9.13\x64\lib\concurrent\futures\thread.py", line 58 in run
  File "C:\hostedtoolcache\windows\Python\3.9.13\x64\lib\concurrent\futures\thread.py", line 83 in _worker
  File "C:\hostedtoolcache\windows\Python\3.9.13\x64\lib\threading.py", line 917 in run
  File "C:\hostedtoolcache\windows\Python\3.9.13\x64\lib\threading.py", line 980 in _bootstrap_inner
  File "C:\hostedtoolcache\windows\Python\3.9.13\x64\lib\threading.py", line 937 in _bootstrap
  File "C:\hostedtoolcache\windows\Python\3.9.13\x64\lib\site-packages\solara\server\patch.py", line 306 in _WidgetContextAwareThread__bootstrap
Windows fatal exception: access violation
  File "C:\hostedtoolcache\windows\Python\3.9.13\x64\lib\site-packages\solara\server\patch.py", line 284 in WidgetContextAwareThread__bootstrap

Thread 0x000019ec (most recent call first):
  File "C:\hostedtoolcache\windows\Python\3.9.13\x64\lib\threading.py", line 316 in wait
  File "C:\hostedtoolcache\windows\Python\3.9.13\x64\lib\threading.py", line 581 in wait
  File "C:\hostedtoolcache\windows\Python\3.9.13\x64\lib\threading.py", line 1304 in run
  File "C:\hostedtoolcache\windows\Python\3.9.13\x64\lib\threading.py", line 980 in _bootstrap_inner
  File "C:\hostedtoolcache\windows\Python\3.9.13\x64\lib\threading.py", line 937 in _bootstrap
  File "C:\hostedtoolcache\windows\Python\3.9.13\x64\lib\site-packages\solara\server\patch.py", line 306 in _WidgetContextAwareThread__bootstrap
  File "C:\hostedtoolcache\windows\Python\3.9.13\x64\lib\site-packages\solara\server\patch.py", line 284 in WidgetContextAwareThread__bootstrap

Thread 0x000019fc (most recent call first):
  File "C:\hostedtoolcache\windows\Python\3.9.13\x64\lib\threading.py", line 312 in wait
D:\a\_temp\73d90dee-0906-4bd0-95c2-c44622c2060b.sh: line 3: 311 Segmentation fault pytest tests/unit --doctest-modules --timeout=60
```
ddelange commented 2 weeks ago

potentially a memory leak specific to windows:

https://stackoverflow.com/questions/64421004/how-to-debug-access-violation-memory-issues-in-python-under-windows#comment113941849_64421004
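For anyone trying to reproduce this outside pytest: the per-thread dump in the report has the shape of output from Python's standard-library `faulthandler` module, which recent pytest versions enable by default. Below is a minimal sketch of turning it on in a standalone script so that an access violation still leaves a traceback behind. The log file name and its location in the temp directory are assumptions for illustration and are not taken from the vaex or solara CI setup.

```python
import faulthandler
import os
import tempfile

# Hypothetical log location; pick whatever path your CI job uploads as an artifact.
CRASH_LOG_PATH = os.path.join(tempfile.gettempdir(), "crash_traceback.log")

# The file object must stay open for as long as the handler is installed,
# because faulthandler writes to its file descriptor when the crash happens.
crash_log = open(CRASH_LOG_PATH, "w")

# Dump the Python stack of every thread on a fatal error
# (SIGSEGV and friends, plus Windows exceptions such as access violations).
faulthandler.enable(file=crash_log, all_threads=True)

print("faulthandler enabled:", faulthandler.is_enabled())
```

Setting the environment variable `PYTHONFAULTHANDLER=1` or running `python -X faulthandler` should have the same effect without code changes, with the dump going to stderr. Note that the dump only shows Python frames; the crashing native frame (here somewhere below `hash.py`, line 171 in `add`) would still need a native debugger to pin down.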

maartenbreddels commented 1 week ago

I wonder if we should use the same wheel we build for releasing in the testing.

ddelange commented 1 week ago

cibuildwheel has built-in support for running pytest on built wheels right after building them: https://cibuildwheel.pypa.io/en/stable/options/#testing

setu4993 commented 1 day ago

@maartenbreddels @ddelange: Sorry to bother you, but I'm curious when a fix for this might roll out. Maybe a version that skips Windows temporarily would be easier?