Open ovcharenko opened 8 months ago
This may be related to https://github.com/apache/arrow/issues/38364. I believe that change means S3 is no longer initialised up front but in each thread instead, and I can believe that introduces a race condition or something similar.
Found another workaround: call ensure_s3_initialized() before the thread loop.
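A minimal sketch of that workaround, in case it helps others. It assumes a pyarrow build with S3 support, where `pyarrow.fs.ensure_s3_initialized()` is available. The helper names `run_with_upfront_init` and `open_s3_filesystems` are made up for illustration and are not part of pyarrow.

```python
from concurrent.futures import ThreadPoolExecutor


def run_with_upfront_init(init, work, items, max_workers=4):
    """Call `init` once on the calling thread, then fan `work` out to a pool.

    Running the initialisation before any worker starts means the threads
    never compete to perform it.
    """
    init()
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map returns results in the same order as `items`.
        return list(pool.map(work, items))


def open_s3_filesystems(uris, max_workers=4):
    """Hypothetical usage: open one filesystem per S3 URI from worker threads."""
    # Imported lazily so the helper above can be used without pyarrow installed.
    from pyarrow import fs

    def open_one(uri):
        # FileSystem.from_uri returns a (filesystem, path) tuple.
        filesystem, _path = fs.FileSystem.from_uri(uri)
        return filesystem

    return run_with_upfront_init(
        fs.ensure_s3_initialized, open_one, uris, max_workers
    )
```

The only point is the ordering: the initialisation call runs on the main thread before any worker touches S3, which matches the path in the backtrace below where the crash occurs inside a worker thread.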
I've also encountered this issue with pyarrow 15.0.0
Thread 27 "python" received signal SIGSEGV, Segmentation fault.
[Switching to Thread 0x7fffb93e5640 (LWP 249826)]
0x00007fffca1ecef0 in Aws::Utils::Threading::ReaderWriterLock::LockReader() () from /home/ubuntu/workspace/example/.venv/lib/python3.10/site-packages/pyarrow/libarrow.so.1500
(gdb) bt
#0 0x00007fffca1ecef0 in Aws::Utils::Threading::ReaderWriterLock::LockReader() () from /home/ubuntu/workspace/example/.venv/lib/python3.10/site-packages/pyarrow/libarrow.so.1500
#1 0x00007fffca20f002 in Aws::Config::ConfigAndCredentialsCacheManager::GetConfigProfile(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&) const () from /home/ubuntu/workspace/example/.venv/lib/python3.10/site-packages/pyarrow/libarrow.so.1500
#2 0x00007fffca20f2a3 in Aws::Config::GetCachedConfigProfile(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&) () from /home/ubuntu/workspace/example/.venv/lib/python3.10/site-packages/pyarrow/libarrow.so.1500
#3 0x00007fffca2118d6 in Aws::Auth::STSAssumeRoleWebIdentityCredentialsProvider::STSAssumeRoleWebIdentityCredentialsProvider() () from /home/ubuntu/workspace/example/.venv/lib/python3.10/site-packages/pyarrow/libarrow.so.1500
#4 0x00007fffca213a06 in Aws::Auth::DefaultAWSCredentialsProviderChain::DefaultAWSCredentialsProviderChain() () from /home/ubuntu/workspace/example/.venv/lib/python3.10/site-packages/pyarrow/libarrow.so.1500
#5 0x00007fffc9a6f559 in arrow::fs::S3Options::ConfigureDefaultCredentials() () from /home/ubuntu/workspace/example/.venv/lib/python3.10/site-packages/pyarrow/libarrow.so.1500
#6 0x00007fffc9a7ecc0 in arrow::fs::S3Options::FromUri(arrow::internal::Uri const&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >*) () from /home/ubuntu/workspace/example/.venv/lib/python3.10/site-packages/pyarrow/libarrow.so.1500
#7 0x00007fffc8da7f73 in arrow::fs::(anonymous namespace)::FileSystemFromUriReal(arrow::internal::Uri const&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, arrow::io::IOContext const&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >*) () from /home/ubuntu/workspace/example/.venv/lib/python3.10/site-packages/pyarrow/libarrow.so.1500
#8 0x00007fffc8da8a40 in arrow::fs::FileSystemFromUri(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, arrow::io::IOContext const&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >*) ()
from /home/ubuntu/workspace/example/.venv/lib/python3.10/site-packages/pyarrow/libarrow.so.1500
#9 0x00007fffc8da8df1 in arrow::fs::FileSystemFromUriOrPath(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, arrow::io::IOContext const&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >*) ()
from /home/ubuntu/workspace/example/.venv/lib/python3.10/site-packages/pyarrow/libarrow.so.1500
#10 0x00007fffc8da8e53 in arrow::fs::FileSystemFromUriOrPath(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >*) ()
from /home/ubuntu/workspace/example/.venv/lib/python3.10/site-packages/pyarrow/libarrow.so.1500
$ pip freeze | grep pyarrow
pyarrow==15.0.0
Has anyone tried with the latest PyArrow (17.0.0)?
I can confirm that the problem is fixed in v17.0.0.
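For anyone who has to support several PyArrow versions, a small check like the following could decide whether to apply a workaround. This is only a sketch: the cutoff of 17.0.0 comes from the report above, treating every earlier version as possibly affected is a conservative assumption (only 14.0.2 and 15.0.0 were reported in this thread), and the function names are hypothetical.

```python
import re

# Version in which the fix was reported in this thread.
FIXED_IN = (17, 0, 0)


def parse_version(version):
    """Extract (major, minor, patch) from a version string such as '15.0.0'."""
    match = re.match(r"(\d+)\.(\d+)\.(\d+)", version)
    if match is None:
        raise ValueError(f"unrecognised version string: {version!r}")
    return tuple(int(group) for group in match.groups())


def may_need_s3_workaround(version):
    """Return True for versions older than the one reported as fixed.

    Conservative assumption: anything below FIXED_IN is treated as
    possibly affected, even though not every such version was tested.
    """
    return parse_version(version) < FIXED_IN
```

In practice this would be called with `pyarrow.__version__`.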
Describe the bug, including details regarding any error messages, version, and platform.
When running a simple test in Docker, I get a Segmentation fault with PyArrow v14.0.2. The same test works fine in earlier versions.
Sample test:
To reproduce:
docker run --rm -it --init --ulimit core=-1 --mount type=bind,source=/tmp/,target=/tmp/ -v /root/work:/work --entrypoint /bin/bash amd64/python:3.11-slim
Workarounds:
Failure details:
Component(s)
Python