elastic / apm-agent-python

https://www.elastic.co/guide/en/apm/agent/python/current/index.html
BSD 3-Clause "New" or "Revised" License
412 stars 219 forks source link

database trace show twice #695

Closed ackerr closed 4 years ago

ackerr commented 4 years ago

maybe this is not a bug kibana show every database trace will show twice image

but in 4.2.2 version, it show this image

i think its related to span? in 4.2.2 agent won't upload span log, please tell me how to resovle it 😢 (forgive my poor English)

Environment (please complete the following information)

basepi commented 4 years ago

It definitely seems as though the spans are being captured twice. Do you perhaps have any custom instrumentations in place? Or any manual capture_span calls? It's possible that in 5.3.3 we have automatic instrumentation in place that is causing duplicates with any custom/manual spans you're capturing (due to a lack of support in 4.2.2, perhaps).

ackerr commented 4 years ago

we use capture_span to show the rest_framework.serializers.Serializer.to_representatio and celery.Task.apply_asyn , but they all only show once ,when i delete the capture_span code , the database track also twice. And use elasticapm.set_user_context() to trace user authenticate (database trace), other is the default configuration.

capture_span code like this

class CelerySpan(elasticapm.capture_span):
    def __call__(self, func):
        self.name = self.name or get_name_from_func(func)
        self.type = "celery.delay"

        @fn.wraps(func)
        def decorated(obj, *args, **kwds):
            self.name = f"{obj.__name__}.{func.__name__}"
            with self:
                return func(obj, *args, **kwds)
        return decorated
basepi commented 4 years ago

Can you please provide a pip list for us so we can see what you have installed? This will help us narrow down the source of the extra spans.

I assume that only elastic-apm was upgraded? The rest of the dependencies are the same?

ackerr commented 4 years ago

pipenv graph

[packages]
attrs = "==19.3.0"
autobahn = "==20.1.2"
celery = "==4.4.0"
cffi = "==1.13.2"
chinesecalendar = "==1.3.1"
cryptography = "==2.8"
django = "==3.0.2"
django-cors-headers = "==3.2.1"
django-mysql = "==3.3.0"
django-mysql-geventpool = "==0.2.5"
django-s3-storage = "==0.13.0"
djangorestframework = "==3.11.0"
drf-nested-routers = "==0.91"
drf-yasg = "==1.17.0"
elastic-apm = "==5.3.3"
emoji = "==0.5.4"
gevent = "==1.4.0"
hutils = "==0.6.1"
inflection = "==0.3.1"
jsonpickle = "==1.2"
kombu = "==4.6.7"
lupa = "==1.9"
mongoengine = "==0.19.1"
pillow = "==7.0.0"
pyjwt = "==1.7.1"
pymysql = "==0.9.3"
python-dateutil = "==2.8.0"
pytz = "==2019.3"
pywin32 = {version = "==227",sys_platform = "== 'win32'"}
qiniu = "==7.2.6"
qrcode = "==6.1"
redis = "==3.3.11"
rx = "==3.0.1"
sentry-sdk = "==0.14.0"
sqlalchemy = "==1.3.12"
tenacity = "==6.0.0"
wechatpy = "==1.8.6"
xlrd = "==1.2.0"
xlwt = "==1.3.0"
xmltodict = "==0.12.0"

[dev-packages]
black = "==19.10b0"
fabric3 = "==1.14.post1"
factory-boy = ">=2.12.0"
faker = ">=3.0.0"
fakeredis = ">=1.0"
flake8 = ">=3.7.0"
flake8-import-order = ">=0.18"
freezegun = ">=0.3.13"
ipython = ">=7.11.1"
isort = {version = "*",extras = ["pipfile"]}
mongomock = ">=3.17.0"
pook = ">=0.2.8"
pycodestyle = ">=2.4.0"
pyflakes = ">=2.0.0"
pylint = ">=2.2.2"
pylint-django = ">=2.0.2"
pylint-topology = ">=0.1.1"
pytest-cov = ">=2.8.1"
pytest-django = ">=3.8.0"
pytest-xdist = ">=1.31.0"
basepi commented 4 years ago

Nothing super obvious there.

I had another thought. Can you show me the stack trace frames for two of the duplicate spans? Like this:

span

ackerr commented 4 years ago

an example, the duplicate spans all like this

image

image

basepi commented 4 years ago

Unfortunately your screenshots only show one of the spans -- what I'd like to do is compare the frames of two duplicate spans, see if I can spot why it's being instrumented twice. Can you add the duplicate to the span shown above?

ackerr commented 4 years ago

a new api from local server screencapture-span1

this is the deplicate spans screencapture-span2

beniwohli commented 4 years ago

Hi @zzzzzzmj

As @basepi says, this kind of looks like we instrument things twice. That shouldn't happen, as we check before instrumenting. But there could be bugs with those checks of course.

If possible, could you add this code in a view somewhere and see if the assertion fails? Of course only on a test system :)

from urllib.request import AbstractHTTPHandler
assert not hasattr(AbstractHTTPHandler.do_open._self_wrapper, "_self_wrapper")
ackerr commented 4 years ago

I add the code in a view and the request is also 200. @beniwohli

beniwohli commented 4 years ago

Hi @zzzzzzmj

I implemented a possible workaround in #708. You can give that a try by installing it like this:

pip install https://github.com/beniwohli/apm-agent-python/archive/avoid-double-instrumentation.zip\#egg\=elastic-apm\=\=100

Can you try that out on your test machine and see if it fixes the duplicate spans?

ackerr commented 4 years ago

@beniwohli i install the patch and try again,now only once.

beniwohli commented 4 years ago

@zzzzzzmj that's great news, as it confirms that the issue is really due to instrumenting twice. I'm still not quite sure how and why this happens, as we have protections against it, and use threading locks during instrumenting itself.

I'll investigate a bit further to see if I can get to the root cause, before merging #708

ackerr commented 4 years ago

@beniwohli Thanks! Waiting the release