DataDog / dd-trace-py

Datadog Python APM Client
https://ddtrace.readthedocs.io/

ddtrace.internal.writer:trace buffer(209 traces 8378483b/8388608b) cannot fit trace of size 27505b, dropping #2834

Closed gmlwo530 closed 3 years ago

gmlwo530 commented 3 years ago

Which version of dd-trace-py are you using?

0.52.1

Which version of pip are you using?

21.2.4 (python 3.8)

Which version of the libraries are you using?

You can copy/paste the output of pip freeze here.

How can we reproduce your problem?

I updated dd-trace-py from 0.47.0 to 0.52.1 and Python from 3.6 to 3.8.

What is the result that you get?

Below is the Ubuntu syslog output (path: /var/log/syslog):

```
trace-agent[2732]: 2021-09-06 08:57:58 UTC | TRACE | INFO | (pkg/trace/info/stats.go:101 in LogStats) | No data received
Sep  6 08:58:12 **ip-xxxxxxx** ddtrace-run[30841]: WARNING:ddtrace.internal.writer:trace buffer (209 traces 8378483b/8388608b) cannot fit trace of size 27505b, dropping
Sep  6 08:58:48 **ip-xxxxxxx** process-agent[2731]: 2021-09-06 08:58:48 UTC | PROCESS | INFO | (collector.go:164 in runCheck) | Finish container check #20 in 47.877µs
Sep  6 08:59:03 **ip-xxxxxxxx** ddtrace-run[30841]: WARNING:ddtrace.internal.writer:trace buffer (210 traces 8385997b/8388608b) cannot fit trace of size 17244b, dropping, 16 additional messages skipped
Sep  6 08:59:08 **ip-xxxxxxxx** trace-agent[2732]: 2021-09-06 08:59:08 UTC | TRACE | INFO | (pkg/trace/info/stats.go:101 in LogStats) | No data received
```
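
(For context: 8388608 b is an 8 MiB trace buffer. In the first warning, 8378483 b are already buffered, so only 10125 b remain and the 27505 b trace cannot fit, hence "dropping".)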

Also, my Datadog dashboard displayed "No data".

What is the result that you expected?

Tracing should work normally, with traces visible in the dashboard.

Other

When the dd-trace-py version is downgraded, Datadog works fine.

Kyle-Verhoog commented 3 years ago

Hi @gmlwo530, thanks for the report.

Just to confirm: you're seeing the issue in 0.52.1 but not in 0.47.0, correct?

If so, this is possibly because we fixed an issue that was producing incorrect trace metrics; the fix can result in more traces being kept in memory. To work around this you can try setting the DD_TRACE_PARTIAL_FLUSH_ENABLED=true environment variable.
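
For example (an illustrative sketch; the exact command depends on how your app is launched):

```sh
# Workaround suggested above: flush completed parts of large traces early
# instead of buffering whole traces until they finish.
DD_TRACE_PARTIAL_FLUSH_ENABLED=true ddtrace-run uwsgi --emperor /etc/uwsgi/sites
```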

Do you mind sharing which integrations you're using, or whether you're doing any custom instrumentation, so we can get an idea of why the traces would be so large? It might be a bug that's leading to really large traces.

gmlwo530 commented 3 years ago

@Kyle-Verhoog

Thanks for reply.

I get the issue in 0.52.1; it works well in 0.47.0.

I'm only using the ddtrace package, running uwsgi like below:

```sh
/<path>/ddtrace-run /<path>/uwsgi --emperor /etc/uwsgi/sites
```

And I'm using these ddtrace environment variables:

```sh
DD_SERVICE='dailyshot-admin'
DD_ENV='prod'
DD_LOGS_INJECTION='true'
```

I will try the DD_TRACE_PARTIAL_FLUSH_ENABLED=true setting and share the result.
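
In a systemd-managed setup like the one shown later in this thread, that would be one more Environment line in the unit file (a sketch):

```ini
Environment="DD_TRACE_PARTIAL_FLUSH_ENABLED=true"
```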

mintusah25 commented 3 years ago

Hey... I am facing this issue with 0.53.0: `trace buffer (197 traces 8371335b/8388608b) cannot fit trace of size 109694b, dropping`

And there is no data on the Datadog dashboard even with DD_TRACE_PARTIAL_FLUSH_ENABLED=true. @gmlwo530, did the above env variable work for you?

gmlwo530 commented 3 years ago

@mintusah25 I have tested with DD_TRACE_PARTIAL_FLUSH_ENABLED=true on 0.52.1 and 0.53.0. Unfortunately, I faced the same issue.

@Kyle-Verhoog Is there another way to resolve this issue? If you want to know more about my development environment, let me know what you need.

Kyle-Verhoog commented 3 years ago

@gmlwo530, @mintusah25 would you be able to provide more information about your applications? The output of `pip freeze` would help give some insight into the libraries you're using.

If you're able to provide a reproduction of the issue that would be great.

gmlwo530 commented 3 years ago

```
amqp==5.0.6
appdirs==1.4.4
asgiref==3.2.10
asn1crypto==0.24.0
async-timeout==2.0.1
attrs==21.2.0
autobahn==19.3.2
Automat==20.2.0
autopep8==1.5.5
bcrypt==3.1.6
beautifulsoup4==4.6.0
billiard==3.6.4.0
boto3==1.14.45
botocore==1.17.45
bs4==0.0.1
cachetools==4.0.0
celery==5.0.5
certifi==2017.4.17
cffi==1.12.2
channels==2.4.0
chardet==3.0.3
click==7.1.2
click-didyoumean==0.0.3
click-plugins==1.1.1
click-repl==0.1.6
colorama==0.4.1
constantly==15.1.0
coolsms-python-sdk==2.0.3
coreapi==2.3.3
coreschema==0.0.4
cryptography==3.4.7
curlify==2.2.1
daphne==2.5.0
ddtrace==0.47.0
defusedxml==0.5.0
diff-match-patch==20181111
Django==2.2
django-admin-interface==0.14.2
django-ajax-selects==1.7.1
django-allauth==0.32.0
django-appconf==1.0.3
django-autocomplete-light==3.8.1
django-celery-results==2.0.1
django-colorfield==0.4.1
django-cors-headers==2.4.1
django-crontab==0.7.1
django-debug-toolbar==1.9.1
django-extensions==2.2.9
django-flat-responsive==2.0
django-flat-theme==1.1.4
django-image-cropping==1.4.0
django-imagekit==4.0.1
django-import-export==2.5.0
django-ipware==2.1.0
django-model-utils==3.1.2
django-multiupload==0.5.2
django-querycount==0.7.0
django-recaptcha==1.4.0
django-redis==4.12.1
django-rest-swagger==2.2.0
django-rosetta==0.8.1
django-silk==4.1.0
django-storages==1.6.6
django-tags-input==5.0.0
django-vinaigrette==1.1.1
djangorestframework==3.11.0
docutils==0.15.2
drf-yasg==1.17.0
easy-thumbnails==2.7
entrypoints==0.3
et-xmlfile==1.0.1
facebook-business==5.0.3
flower==0.9.4
freezegun==0.3.15
geographiclib==1.50
geopy==2.0.0
google-api-python-client==1.7.11
google-auth==1.24.0
google-auth-httplib2==0.0.3
google-auth-oauthlib==0.4.2
gprof2dot==2019.11.30
gspread==3.6.0
httplib2==0.15.0
humanize==0.5.1
hyperlink==21.0.0
idna==2.5
importlib-metadata==1.3.0
incremental==21.3.0
inflection==0.3.1
intervaltree==3.1.0
itypes==1.1.0
jdcal==1.4
Jinja2==2.10.3
jmespath==0.9.4
jsonschema==3.2.0
kombu==5.0.2
lxml==4.5.2
MarkupPy==1.14
MarkupSafe==1.1.1
mccabe==0.6.1
microsofttranslator==0.8
more-itertools==8.0.2
msgpack-python==0.5.6
mypy-extensions==0.4.3
mysqlclient==1.4.6
oauth2client==4.1.3
oauthlib==2.0.2
odfpy==1.3.6
olefile==0.44
openapi-codec==1.3.2
openpyxl==3.0.5
packaging==19.2
paramiko==2.7.1
pathspec==0.8.1
pilkit==2.0
Pillow==8.3.1
polib==1.1.0
prompt-toolkit==3.0.18
protobuf==3.17.3
pyasn1==0.4.8
pyasn1-modules==0.2.8
pycodestyle==2.6.0
pycparser==2.19
pyflakes==2.2.0
Pygments==2.6.1
PyHamcrest==1.9.0
PyJWT==1.7.1
PyNaCl==1.3.0
pyOpenSSL==19.1.0
pyparsing==2.4.5
pyrsistent==0.15.6
python-dateutil==2.8.1
python3-openid==3.1.0
pytz==2019.3
PyYAML==3.13
redis==3.5.3
regex==2020.7.14
requests==2.17.3
requests-oauthlib==0.8.0
rsa==4.0
ruamel.yaml==0.16.5
ruamel.yaml.clib==0.2.0
s3transfer==0.3.3
selenium==3.141.0
service-identity==18.1.0
simplejson==3.17.0
six==1.15.0
sortedcontainers==2.4.0
sqlparse==0.3.0
swagger-spec-validator==2.4.3
tablib==2.0.0
tenacity==8.0.1
toml==0.10.1
tornado==5.1.1
twilio==6.35.1
Twisted==21.2.0
txaio==18.8.1
typed-ast==1.4.3
typing-extensions==3.10.0.0
unicodecsv==0.14.1
uritemplate==3.0.1
urllib3==1.21.1
uWSGI==2.0.19.1
vine==5.0.0
wcwidth==0.2.5
Werkzeug==1.0.1
xlrd==1.1.0
xlwt==1.3.0
zipp==0.6.0
zope.interface==5.4.0
```

I run the ddtrace command in uwsgi.service (the server runs uwsgi under systemd).

Below are my uwsgi.service settings:

```ini
[Unit]
Description=uWSGI Emperor service

[Service]
Environment="DD_SERVICE='dailyshot-admin'"
Environment="DD_ENV='prod'"
Environment="DD_LOGS_INJECTION='true'"
ExecStartPre=/bin/bash -c 'mkdir -p /run/uwsgi; chown ubuntu:ubuntu /run/uwsgi'
ExecStart=<virtualenv-path>/ddtrace-run <virtualenv-path>/uwsgi --emperor /etc/uwsgi/sites
Restart=always
KillSignal=SIGQUIT
Type=notify
NotifyAccess=all

[Install]
WantedBy=multi-user.target
```

jalaziz commented 3 years ago

I believe we are facing this issue too. It started with 0.50.0. It's happening with a Django app that is run with uwsgi. We are using the import method of configuring uwsgi. It doesn't seem to be happening with non-Django/uwsgi services.

Kyle-Verhoog commented 3 years ago

Hmm. We do have some known compatibility issues with uwsgi.

@gmlwo530 thanks for the info! Can you try configuring uwsgi using the instructions here: https://ddtrace.readthedocs.io/en/stable/advanced_usage.html#uwsgi

@jalaziz can you double check that your uwsgi settings match the documentation above?
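
For reference, the settings from that page, combined with the flags named later in this thread, look roughly like this in a uwsgi ini file (a sketch; the comments are my reading of this thread, not official documentation):

```ini
[uwsgi]
; ddtrace runs background threads, so threads must be enabled
enable-threads = true
; load the application in each worker after fork, so ddtrace's
; background threads are not started in the master process
lazy-apps = true
; performs the same setup as ddtrace-run without wrapping the command
import = ddtrace.bootstrap.sitecustomize
```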

jalaziz commented 3 years ago

https://ddtrace.readthedocs.io/en/stable/advanced_usage.html#uwsgi

We have everything enabled except lazy-apps. Did that requirement change recently? Could've sworn it was not required before. Will test out uwsgi with that option enabled.

jalaziz commented 3 years ago

> https://ddtrace.readthedocs.io/en/stable/advanced_usage.html#uwsgi
>
> We have everything enabled except lazy-apps. Did that requirement change recently? Could've sworn it was not required before. Will test out uwsgi with that option enabled.

@Kyle-Verhoog Indeed, lazy-apps did the trick. Do you happen to know what changed with 0.50.0 that requires lazy-apps now? Could it be the context management changes?

gmlwo530 commented 3 years ago

> Hmm. We do have some known compatibility issues with uwsgi.
>
> @gmlwo530 thanks for the info! Can you try configuring uwsgi using the instructions here: https://ddtrace.readthedocs.io/en/stable/advanced_usage.html#uwsgi

@Kyle-Verhoog I resolved the issue by editing my uwsgi settings file following the document you shared. Thank you for your help, and I'm sorry for not checking the documentation thoroughly.

brettlangdon commented 3 years ago

@jalaziz and @gmlwo530 I can answer that.

> Do you happen to know what changed with 0.50.0 that requires lazy-apps now? Could it be the context management changes?

There are two things:

1. We never "required" lazy-apps in the past, but suggested it if you did manual instrumentation or hit one of a few other edge cases. However, we've realized that it is really hard to determine when you need lazy-apps and when you don't, which instrumentations need it, and how to identify all the odd edge cases. So we decided to word our docs more as a requirement, to help people avoid potential problems.
   - For context, the issue is that if we start any of our background threads in the uwsgi master process, we can cause deadlocks with our locking. This is because uwsgi doesn't use PyOS_Fork, but instead has its own fork implementation which does not do proper post-fork cleanup of the existing Python process (which includes releasing any locks). So it is mostly a dice roll, but there's a good chance you'll end up with a deadlock and workers will fail to start/restart, or our background threads won't start back up properly (like the issue you are seeing?). A toy sketch of this fork-while-locked hazard follows this list.
2. This likely isn't related directly to the 0.50.0 release. There might have been a change in there that made the problem worse or easier to hit, but given the lazy-apps issue has been around since the beginning, I am not 100% sure without further testing or investigation.
   - Note: things like enabling runtime metrics or profiling (which start background threads on load), or adding manual instrumentation that runs in the master process (we don't start our writer thread until the first trace is started), are causes we've seen in the past.
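
As a concrete illustration of the post-fork lock problem described above, here is a toy Python sketch (not ddtrace code; the outcome depends on thread timing, so it is probabilistic):

```python
# A toy illustration of the fork-while-locked hazard: if another thread holds
# a lock when fork() happens, the child inherits a locked lock with no thread
# left to release it, and acquire() can never succeed.
import os
import threading
import time

lock = threading.Lock()

def busy_worker():
    while True:
        with lock:          # hold the lock most of the time
            time.sleep(0.1)

threading.Thread(target=busy_worker, daemon=True).start()
time.sleep(0.05)            # let the worker start and grab the lock

pid = os.fork()             # POSIX only; uwsgi forks its workers similarly
if pid == 0:
    # Child process: the worker thread does not exist here, but the lock's
    # state was copied, so this acquire will almost certainly time out.
    acquired = lock.acquire(timeout=2)
    print("child acquired lock:", acquired)
    os._exit(0)
os.waitpid(pid, 0)
```
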
Kyle-Verhoog commented 3 years ago

@gmlwo530 thanks for following up! I'm going to close out the issue since it has been resolved. Please let us know if you have any other issues 🙂

chirayu-miles commented 2 years ago

We're suddenly getting bitten by this. We use uwsgi and cannot switch to lazy-apps because it would greatly affect our startup times. It seems that we cannot use ddtrace-run. How can we initialize Datadog in the uwsgidecorators.postfork handler? (The docs heavily push folks to simply use ddtrace-run.)

Appreciate any help here.
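
For what it's worth, one shape such a postfork initialization could take is sketched below (a hedged sketch, not official ddtrace guidance; it assumes that importing ddtrace.bootstrap.sitecustomize performs the same setup as ddtrace-run, as the reply below suggests):

```python
# A hedged sketch, not official ddtrace guidance: initialize tracing in each
# uwsgi worker after it forks, instead of wrapping the process in ddtrace-run.
import uwsgidecorators

@uwsgidecorators.postfork
def init_ddtrace():
    # Importing the bootstrap module is assumed to perform the same setup
    # that ddtrace-run does (patch integrations, configure the tracer).
    import ddtrace.bootstrap.sitecustomize  # noqa: F401
```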

brettlangdon commented 2 years ago

@chirayu-miles we do not recommend ddtrace-run for use with uwsgi at this time. Instead, please follow our documentation here and use --enable-threads --import=ddtrace.bootstrap.sitecustomize:

https://ddtrace.readthedocs.io/en/stable/advanced_usage.html#uwsgi

We suggest --lazy-apps as well, since we have found the messaging easier when we just always recommend it, but your app may work fine without it.
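
Put together, the suggested invocation might look like this (a sketch; the emperor path is the one used earlier in this thread):

```sh
# --lazy-apps is suggested above but, per the same comment, may not be strictly required
uwsgi --enable-threads --import=ddtrace.bootstrap.sitecustomize --lazy-apps --emperor /etc/uwsgi/sites
```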

We have also opened an upstream fix against uwsgi to try and resolve the source of this issue.

https://github.com/unbit/uwsgi/issues/2387 https://github.com/unbit/uwsgi/pull/2388