Open t0k4rt opened 1 year ago
When I have some time I'll check if it's related to the python version built from sources
I'm not sure if you had time to test against different Python versions, but we are experiencing a similar issue on Python 3.9.14. Specifically, we are doing a join on large datasets (>60 GB).
I'm working on it; I'm building some Docker images to test my code with different Python versions. I'll keep you updated when I have some news!
Similar, selecting 100,000 rows from MS SQL Server, on Ubuntu 22.04.1 (5.15.0-56-generic), Python 3.10.6.
Not running in Docker, but a VMWare VM in this case:
[1191578.055637] show_signal_msg: 22 callbacks suppressed
[1191578.055642] python3[408830]: segfault at 0 ip 00007f75e99e0bca sp 00007ffc0b5e5660 error 6 in connectorx.cpython-310-x86_64-linux-gnu.so[7f75e9694000+1ee8000]
[1191578.055663] Code: 41 56 53 48 83 ec 18 48 89 fb 66 48 8d 3d 46 5e 60 02 66 66 48 e8 26 3b cb ff 48 83 38 00 74 17 48 83 c0 08 48 83 38 00 74 2d <48> ff 0b 74 75 48 83 c4 18 5b 41 5e c3 66 48 8d 3d 19 5e 60 02 66
What language are you using?
Python
What version are you using?
0.3.0
What database are you using?
Postgresql
What dataframe are you using?
Pandas
Can you describe your bug?
I've got a Python script running in Docker that loads data from SQL into a pandas dataframe; depending on the volume of data, it can load between 15 GB and 60 GB into memory.
This issue is not related to Docker memory limits: the script is monitored and fails well below the container's memory limit.
The issue I get is complicated. It mainly fails silently; I need to open dmesg to see the segfaults.
It seems to happen when the data has finished downloading from the database.
The issue is Docker-specific: when I run the script on my dev machine (without Docker), everything goes well.
It seems to me there are two cases:
First case: my Python script uses 15 GB of memory.
When the data transfer is finished, the Python script fails silently and triggers a segfault.
My process then restarts (I've got a restart policy for my failed containers), and in this case the process does not fail (this happened every time).
Second case: my script uses 30 GB of memory.
When the data transfer is finished, the Python script fails with the error "corrupted double-linked list (not small)" and seems to trigger the same kind of segfault.
My process then restarts (I've got a restart policy for my failed containers), but the process still fails.
What are the steps to reproduce the behavior?
I cannot reproduce this on my local machine because my local Docker instance hits the container memory limit and the container is shut down.
The host is running Debian 11 (128 GB RAM / 24 cores).
Our containers use the latest Python 3.8.15, built with pyenv using these specific build flags:
RUN CONFIGURE_OPTS="--enable-shared" PYTHON_CFLAG="-march=haswell -O3 -pipe" pyenv install ${PYTHON_VERSION}
Database setup if the error only happens on specific data or data type
Example query / code
This script and query should generate the same kind of data we are using, at a high enough volume, in a Docker container:
test_bug.py
Dockerfile
docker build --pull -t connectorx-bug -f Dockerfile .
docker run --env PG_CONN_URL=your_db_conn_url connectorx-bug
What is the error?
Segfault
And sometimes "corrupted double-linked list (not small)"