Open fbexiga opened 7 months ago
Thanks for reporting this, @fbexiga. If turning off the Profiling functionality is an option for your use case, it's the first thing I'd recommend. Does the error still occur when you set DD_PROFILING_ENABLED=0?
cc @sanchda
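For anyone who launches the service from a Python wrapper rather than a shell, the suggestion above could be applied roughly as follows. This is only a sketch: the helper name `env_with_profiling_disabled` and the `app:app` entry point are hypothetical, and only `DD_PROFILING_ENABLED=0` comes from this thread.

```python
import os


def env_with_profiling_disabled(base_env=None):
    """Return a copy of the environment with ddtrace profiling switched off.

    The mapping passed in (or os.environ) is copied, never modified in place.
    """
    env = dict(os.environ if base_env is None else base_env)
    env["DD_PROFILING_ENABLED"] = "0"
    return env


# Possible usage ("app:app" is a placeholder for your own WSGI entry point):
#
#   import subprocess
#   subprocess.run(
#       ["ddtrace-run", "gunicorn", "-k", "gevent", "app:app"],
#       env=env_with_profiling_disabled(),
#   )
```

Passing the environment explicitly keeps the test run isolated from whatever is exported in the parent shell.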
@fbexiga, thank you so much for the thorough and insightful report. Unfortunately, I don't think we have a short-term workaround, but we'll try to get this resolved promptly.
That's ok, for now we just downgraded back to 3.11.8. No rush or anything, but I thought it was worth reporting.
I tried disabling profiling but still same result.
I have the same error in a Celery application using Python 3.11.9 + gevent + ddtrace
Traceback (most recent call last):
  File "src/gevent/_abstract_linkable.py", line 287, in gevent._gevent_c_abstract_linkable.AbstractLinkable._notify_links
  File "src/gevent/_abstract_linkable.py", line 333, in gevent._gevent_c_abstract_linkable.AbstractLinkable._notify_links
AssertionError: (None, <callback at 0x7fe8acaaa4c0 args=([],)>)
2024-04-09T21:13:55Z <callback at 0x7fe8acaaa4c0 args=([],)> failed with AssertionError
Also affects Python 3.12.3; used to work just fine with 3.12.2.
Encountered a similar exception, but we don't have profiling enabled. It also happened when moving from 3.11.8 to 3.11.9. Rolling back the Python version resolved the error.
Traceback (most recent call last):
  File "/usr/local/lib/python3.11/threading.py", line 1002, in _bootstrap
    self._bootstrap_inner()
  File "/usr/local/lib/python3.11/threading.py", line 1049, in _bootstrap_inner
    self._delete()
  File "/usr/local/lib/python3.11/threading.py", line 1081, in _delete
    del _active[get_ident()]
        ~~~~~~~^^^^^^^^^^^^^
KeyError: 139737853141056
ddtrace==2.7.6 django==4.2.11 gevent==23.9.1 greenlet==3.0.3 gunicorn==21.2.0
There is no clear link between this issue and https://github.com/DataDog/dd-trace-py/pull/8870, but it might be worth testing it once it's released 🤞. Meanwhile, we'll see if we can reproduce this issue.
I was testing this and found that the crash does not happen on an Intel processor but does happen on AMD EPYC. Disabling ddtrace prevents the crash on AMD EPYC.
Intel processor: Intel(R) Xeon(R) CPU @ 2.20GHz
AMD EPYC processor: AMD EPYC 7B12
Docker image = python:3.11.9-slim
ddtrace==2.8.2
flask==3.0.3
gevent==24.2.1
greenlet==3.0.3
gunicorn==22.0.0
Downgrading to python 3.11.8 stops the crash on AMD EPYC.
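Since the reports in this thread differ mainly by CPU and package versions, a small standard-library script can collect the same details in one go. This is a sketch, not an official diagnostic: the function names are made up for illustration, the package list simply mirrors the one above, and reading the CPU model from `/proc/cpuinfo` assumes a Linux host (it falls back to `platform.processor()` elsewhere).

```python
import platform
from importlib import metadata

# Distribution names taken from the list above; adjust to your own stack.
PACKAGES = ("ddtrace", "flask", "gevent", "greenlet", "gunicorn")


def collect_versions(packages=PACKAGES):
    """Map each distribution name to its installed version, or None if absent."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


def cpu_model(cpuinfo_path="/proc/cpuinfo"):
    """Best-effort CPU model string (Linux-specific path, with a fallback)."""
    try:
        with open(cpuinfo_path, encoding="utf-8") as handle:
            for line in handle:
                if line.lower().startswith("model name"):
                    return line.split(":", 1)[1].strip()
    except OSError:
        pass
    return platform.processor() or "unknown"


def environment_report(packages=PACKAGES):
    """Build a plain-text report suitable for pasting into an issue."""
    lines = [f"python=={platform.python_version()}", f"cpu={cpu_model()}"]
    for name, version in collect_versions(packages).items():
        lines.append(f"{name}=={version}" if version else f"{name} (not installed)")
    return "\n".join(lines)


if __name__ == "__main__":
    print(environment_report())
```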
Any movement on this?
~Reproducible with Python 3.12.4 + gevent 24.2.1 + greenlet 3.0.3 + ddtrace 2.9.0.~
~UPD 1: Only reproducible together with sentry-sdk.~
UPD 2: Reproducible without sentry-sdk. It was a red herring.
Also affects Python 3.12.3; used to work just fine with 3.12.2.
I likewise encountered a similar issue when using 3.12.3. Downgrading to 3.12.2 fixed the issue.
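To make the version reports so far easier to act on, the sketch below classifies an interpreter against the patch releases mentioned in this thread. The two sets are only what commenters here reported, not an official list of affected releases, and the function name is hypothetical.

```python
import sys

# Patch releases reported as crashing in this thread (not an official list).
REPORTED_AFFECTED = {(3, 11, 9), (3, 12, 3), (3, 12, 4)}
# Patch releases reported as working again after a downgrade.
REPORTED_WORKING = {(3, 11, 8), (3, 12, 2)}


def classify_python(version_info=None):
    """Return 'reported-affected', 'reported-working', or 'unknown'."""
    source = sys.version_info if version_info is None else version_info
    version = tuple(source[:3])
    if version in REPORTED_AFFECTED:
        return "reported-affected"
    if version in REPORTED_WORKING:
        return "reported-working"
    return "unknown"


if __name__ == "__main__":
    print(sys.version.split()[0], classify_python())
```

Anything outside the two sets is deliberately reported as unknown rather than guessed.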
I finally have a working reproducer: https://github.com/iherasymenko/ddtrace-8903-reproducer
Chasing it down required a machine with the AMD EPYC 7R13 processor (an AWS EC2 c6a.8xlarge VM) but it seems like the simplified version works fine both on my M3 MacBook Pro and my Intel Core i7 Linux machine.
ddtrace v2.10.0rc2 is still affected by the issue.
Also, in this particular example, disabling patching of mongoengine via DD_PATCH_MODULES="mongoengine:false" helps, but this is not really an option, as the other enabled integrations will cause a similar effect.
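When bisecting which integration triggers the crash, it can help to generate the `DD_PATCH_MODULES` value instead of typing it by hand. The sketch below does that; the helper name is hypothetical, and joining several entries with commas is my understanding of the format rather than something stated in this thread, so check it against the ddtrace configuration docs.

```python
def build_patch_modules(flags):
    """Render {"mongoengine": False} as the string "mongoengine:false".

    Multiple entries are joined with commas (an assumption about the format).
    """
    return ",".join(
        f"{name}:{str(bool(enabled)).lower()}" for name, enabled in flags.items()
    )


# Possible usage when preparing the environment for a test run:
#
#   import os
#   env = dict(os.environ, DD_PATCH_MODULES=build_patch_modules({"mongoengine": False}))
```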
I've also been having these issues and noticed a gevent issue showing that it's not compatible with 3.11.9. It further points to a cpython issue about the import of the threading library that happens before gevent has a chance to patch it.
There's a PR open to address this: https://github.com/python/cpython/pull/120233. I tried the patch locally and was able to get ddtrace-run and gevent to play nice on 3.11.9.
This looks to be an issue strictly with cpython on the latest patch series for 3.11 and 3.12.
EDIT: spelling
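A quick way to see whether the import-order condition described in that comment applies to a given process is to check, at the very top of the entry point, whether `threading` is already loaded before gevent patches anything. This is only a diagnostic sketch using `sys.modules`; the function name and the watch list are illustrative, and a positive result does not by itself prove the crash will occur.

```python
import sys

# Modules whose early import is said in this thread to matter; extend as needed.
WATCHED = ("threading",)


def loaded_before_patching(watched=WATCHED, loaded=None):
    """Return the watched module names that are already imported.

    `loaded` defaults to sys.modules; pass a mapping to test the logic in isolation.
    """
    modules = sys.modules if loaded is None else loaded
    return [name for name in watched if name in modules]


if __name__ == "__main__":
    early = loaded_before_patching()
    if early:
        print("already imported before monkey-patching:", ", ".join(early))
    else:
        print("none of the watched modules are imported yet")
```

Run it before any call to gevent's monkey-patching in your own startup code so the result reflects the state gevent will see.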
Even though https://github.com/python/cpython/pull/120233 is already merged, it looks like it will not be backported to 3.11, as it is not considered a security fix (https://github.com/python/cpython/pull/120233#issuecomment-2207215913).
It is, however, ported to 3.12 / 3.13, so it looks like we will need to upgrade unless there is a plan for gevent to update their code.
The issue is fixed in 2.10.0 and 2.9.4 🎉
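For deployment checks, the sketch below tests whether a ddtrace version string is at or past the releases named above. It assumes that later releases on each line also carry the fix, treats any suffix after the three numeric parts (such as `rc2`) as a pre-release, and uses a hypothetical function name; none of that is confirmed by the maintainers here.

```python
import re

_VERSION_RE = re.compile(r"(\d+)\.(\d+)\.(\d+)(.*)")


def has_reported_fix(version_text):
    """True if the version is 2.9.4+ on the 2.9 line, or 2.10.0+ otherwise."""
    match = _VERSION_RE.fullmatch(version_text.strip())
    if match is None:
        raise ValueError(f"unrecognised version string: {version_text!r}")
    release = tuple(int(part) for part in match.group(1, 2, 3))
    # Simplification: any trailing suffix counts as a pre-release.
    is_prerelease = bool(match.group(4))

    first_fixed = (2, 9, 4) if release[:2] == (2, 9) else (2, 10, 0)
    if release == first_fixed:
        # A pre-release of the first fixed version predates the fix.
        return not is_prerelease
    return release > first_fixed


if __name__ == "__main__":
    for candidate in ("2.8.2", "2.9.3", "2.9.4", "2.10.0rc2", "2.10.0"):
        print(candidate, has_reported_fix(candidate))
```

The pre-release rule matches the earlier report that 2.10.0rc2 was still affected.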
Summary of problem
When trying to start a Flask API using gunicorn + gevent + ddtrace + Python 3.11.9, the application crashes. However, if I use Python 3.11.8 instead, or remove either gevent or ddtrace, it works. Also, I can only reproduce this issue on a Linux system (like Debian Bookworm), not on macOS, for instance.
Edit: it appears that even after downgrading to Py 3.11.8, with ddtrace 2.7.x the application doesn't start properly, although the error is different. With 2.6.x it does work as expected.
Which version of dd-trace-py are you using?
Tested 2.8.0, 2.7.7 and a few more down to 2.6.3
Which version of pip are you using?
Python 3.11.9 pip 24.0
Which libraries and their versions are you using?
ddtrace==2.7.7 flask==3.0.2 gevent==24.2.1 greenlet==3.0.3 gunicorn==21.2.0
How can we reproduce your problem?
If I try to start a Flask API using gunicorn with gevent workers + ddtrace + Python 3.11.9, I get the following error as soon as the worker boots:
What is the result that you get?
I am unable to start the application, getting the error mentioned above.
What is the result that you expected?
I expected the application to start and work just like it does with an older version of Python.