grpc / grpc

The C based gRPC (C++, Python, Ruby, Objective-C, PHP, C#)
https://grpc.io
Apache License 2.0
42.04k stars 10.57k forks

Random segfaults in grpc python after upgrading to 1.59.x and above #37994

Open poojavp95 opened 1 month ago

poojavp95 commented 1 month ago

What version of gRPC and what language are you using?

grpcio 1.59.0+, Python 3.10

What operating system (Linux, Windows,...) and version?

Linux (Ubuntu 20.04.6 LTS, GNU/Linux 5.15.0-1068-aws x86_64)

What runtime / compiler are you using (e.g. python version or version of gcc)

Python 3.10 (same error with Python 3.9.6 as well)

What did you do?

Earlier we were using grpcio 1.48.0 without any issues. Last month we upgraded to grpcio 1.59.0 because of a google-ads library version update, and we started seeing random segfaults. We tried versions 1.59.3, 1.65.5, 1.67.0, etc., but all of them produce the same error. We caught a stack trace of one of the segfaults. We cannot downgrade below 1.59.0 because google-ads requires grpcio >= 1.59.0.

Thread 0x00007fabd6afd640 (most recent call first):
  File "/segment-source/venv/lib/python3.10/site-packages/gevent/_threading.py", line 39 in acquire_with_timeout
  File "/segment-source/venv/lib/python3.10/site-packages/gevent/_threading.py", line 99 in wait
  File "/segment-source/venv/lib/python3.10/site-packages/gevent/_threading.py", line 220 in get
  File "/segment-source/venv/lib/python3.10/site-packages/gevent/threadpool.py", line 195 in run

Thread 0x00007fac111fa640 (most recent call first):
  File "/segment-source/venv/lib/python3.10/site-packages/gevent/_threading.py", line 39 in acquire_with_timeout
  File "/segment-source/venv/lib/python3.10/site-packages/gevent/_threading.py", line 99 in wait
  File "/segment-source/venv/lib/python3.10/site-packages/gevent/_threading.py", line 220 in get
  File "/segment-source/venv/lib/python3.10/site-packages/gevent/threadpool.py", line 195 in run

Current thread 0x00007fac36e19000 (most recent call first):
  File "/segment-source/venv/lib/python3.10/site-packages/grpc/_channel.py", line 1142 in _blocking
  File "/segment-source/venv/lib/python3.10/site-packages/grpc/_channel.py", line 1158 in __call__
  File "/segment-source/venv/lib/python3.10/site-packages/segment_source/client.py", line 44 in set
  File "/segment-source/venv/lib/python3.10/site-packages/integration/resources/base.py", line 129 in set
  File "/segment-source/venv/lib/python3.10/site-packages/integration/resources/ad_performance_reports.py", line 47 in download

Extension modules: greenlet._greenlet, zope.interface._zope_interface_coptimizations, gevent.libev.corecext, gevent._gevent_c_greenlet_primitives, gevent._gevent_c_hub_local, gevent._gevent_c_waiter, gevent._gevent_c_hub_primitives, gevent._gevent_c_ident, gevent._gevent_cgreenlet, gevent._gevent_c_abstract_linkable, gevent._gevent_c_semaphore, gevent._gevent_clocal, gevent._gevent_cevent, gevent._gevent_cqueue, grpc._cython.cygrpc, gevent._gevent_c_imap, google._upb._message, yaml._yaml (total: 18)
{"level":"ERROR","time":"2024-10-23T07:12:13.073198746Z","info":{},"data":{"error":"signal: segmentation fault (core dumped)","job_name":"","program":"source-runner","source":"(*Source).runApplication: /go/src/github.com/segmentio/source-runner/source.go:678","version":"c9243b8"},"message":"error occurred during this sync"}
{"level":"DEBUG","time":"2024-10-23T07:12:13.073287259Z","info":{},"data":{"exit_code":1,"job_name":"","program":"source-runner","version":"c9243b8"},"message":"Stopping...source exit code 1"} 

What did you expect to see?

No segfault

What did you see instead?

Segmentation fault

Make sure you include information that can help us debug (full error message, exception listing, stack trace, logs).

See TROUBLESHOOTING.md for how to diagnose problems better.

Anything else we should know about your project / environment?

sreenithi commented 1 month ago

Hi, thanks for reporting this. Can you provide us with some code to help reproduce the segfaults?

poojavp95 commented 1 month ago

It's hard to isolate the piece of code that causes the segfaults, since they happen randomly after our Python job has been running for around an hour or more. I am attaching requirements.txt and a core dump file. Any pointers on how to mitigate the issue would be appreciated. requirements.txt core-dump.txt
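
The thread dump in the report looks like output from Python's standard `faulthandler` module. For a crash that only shows up after an hour or more, a minimal sketch along these lines can write those tracebacks to a file so they survive the process. It assumes the job has a single entry point where the helper can be called early; `enable_crash_tracebacks` and the log path are hypothetical names, not part of gRPC.

```python
import faulthandler
import signal
import sys


def enable_crash_tracebacks(log_path=None):
    """Turn on faulthandler and return the file object it writes to.

    The returned file must stay open for the lifetime of the process,
    because faulthandler keeps only its file descriptor.
    """
    target = open(log_path, "w") if log_path else sys.stderr
    # Dump the Python tracebacks of all threads on SIGSEGV, SIGFPE,
    # SIGABRT, SIGBUS and SIGILL.
    faulthandler.enable(file=target, all_threads=True)
    # Where available (not on Windows), also allow an on-demand dump
    # with `kill -USR1 <pid>` while the job is still running.
    if hasattr(faulthandler, "register") and hasattr(signal, "SIGUSR1"):
        faulthandler.register(signal.SIGUSR1, file=target, all_threads=True)
    return target
```

Calling `enable_crash_tracebacks("/tmp/crash.log")` once at startup would be enough; setting the environment variable `PYTHONFAULTHANDLER=1` has a similar effect without code changes, but writes to stderr only.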

gnossen commented 1 month ago

@poojavp95 Unfortunately, there's not enough info in this backtrace for us to work with. Could you please build the gRPC library from source so that the backtrace includes debug symbols? You'll need to clone the git repo, then run the following from it:

git submodule update --init --recursive
export GRPC_PYTHON_CFLAGS="-g"
GRPC_PYTHON_BUILD_WITH_CYTHON=1 pip install .
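
After reinstalling, it can help to confirm which grpcio distribution the job's virtualenv actually sees before waiting for the next crash. The following is a small sketch using only the standard library; `installed_version` is a hypothetical helper name, and it reports the installed package metadata rather than anything about debug symbols.

```python
from importlib import metadata


def installed_version(dist_name="grpcio"):
    """Return the installed version string of a distribution, or None."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        # The distribution is not installed in this environment.
        return None


if __name__ == "__main__":
    print("grpcio:", installed_version("grpcio"))
```

Running this with the same interpreter that runs the job (for example the one under `/segment-source/venv`) avoids checking a different environment by mistake.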