DataDog / dd-trace-py

Datadog Python APM Client

trace (14533373b) larger than payload limit (8000000b), dropping #3836

Closed miohtama closed 1 year ago

miohtama commented 2 years ago

Which version of dd-trace-py are you using?

ddtrace                         0.50.4                    Datadog tracing code

Which version of pip are you using?


Which version of the libraries are you using?

aiohttp                         3.8.1                     Async http client/server framework (asyncio)
aiosignal                       1.2.0                     aiosignal: a list of registered asynchronous callbacks
alabaster                       0.7.12                    A configurable sidebar-enabled Sphinx theme
alembic                         1.7.7                     A database migration tool for SQLAlchemy.
appnope                         0.1.3                     Disable App Nap on macOS >= 10.9
argon2-cffi                     21.3.0                    The secure Argon2 password hashing algorithm.
argon2-cffi-bindings            21.2.0                    Low-level CFFI bindings for Argon2
asttokens                       2.0.5                     Annotate AST trees with source code positions
async-timeout                   4.0.2                     Timeout context manager for asyncio programs
attrs                           21.4.0                    Classes Without Boilerplate
babel                           2.10.1                    Internationalization utilities
backcall                        0.2.0                     Specifications for callback functions passed in to an API
base58                          2.1.1                     Base58 and Base58Check implementation.
beautifulsoup4                  4.11.1                    Screen-scraping library
bitarray                        1.2.2                     efficient arrays of booleans -- C extension
bleach                          5.0.0                     An easy safelist-based HTML-sanitizing tool.
cached-property                 1.5.2                     A decorator for caching properties in classes.
cachetools                      5.0.0                     Extensible memoizing collections and decorators
certifi                         2021.10.8                 Python package for providing Mozilla's CA Bundle.
cffi                            1.15.0                    Foreign Function Interface for Python calling C code.
charset-normalizer              2.0.12                    The Real First Universal Charset Detector. Open, modern and actively maintained alternative to Chardet.
chart-studio                    1.1.0                     Utilities for interfacing with plotly's Chart Studio
cli-helpers                     2.2.1                     Helpers for building command-line apps
click                           7.1.2                     Composable command line interface toolkit
coloredlogs                     15.0.1                    Colored terminal output for Python's logging module
colorlover                      0.3.0                     Color scales for IPython notebook
configobj                       5.0.6                     Config file reading, writing and validation.
cufflinks                       0.17.3                    Productivity Tools for Plotly + Pandas
cycler                          0.11.0                    Composable style cycles
cytoolz                         0.11.2                    Cython implementation of Toolz: High performance functional utilities
dataclasses-json                0.5.7                     Easily serialize dataclasses to and from JSON
ddtrace                         0.50.4                    Datadog tracing code
debugpy                         1.6.0                     An implementation of the Debug Adapter Protocol for Python
decorator                       5.1.1                     Decorators for Humans
defusedxml                      0.7.1                     XML bomb protection for Python stdlib modules
deprecated                      1.2.13                    Python @deprecated decorator to deprecate old python classes, functions or methods.
discord-webhook                 0.15.0                    execute discord webhooks
docutils                        0.17.1                    Docutils -- Python Documentation Utilities
dramatiq                        1.13.0                    Background Processing for Python 3.
entrypoints                     0.4                       Discover and load entry points from installed packages.
eth-abi                         2.1.1                     eth_abi: Python utilities for working with Ethereum ABI definitions, especially encoding and decoding
eth-account                     0.5.7                     eth-account: Sign Ethereum transactions and messages with local private keys
eth-bloom                       1.0.4                     Python implementation of the Ethereum Trie structure
eth-hash                        0.3.2                     eth-hash: The Ethereum hashing function, keccak256, sometimes (erroneously) called sha3
eth-keyfile                     0.5.1                     A library for handling the encrypted keyfiles used to store ethereum private keys.
eth-keys                        0.3.4                     Common API for Ethereum key operations.
eth-rlp                         0.2.1                     eth-rlp: RLP definitions for common Ethereum objects in Python
eth-tester                      0.6.0b6                   Tools for testing Ethereum applications.
eth-typing                      2.3.0                     eth-typing: Common type annotations for ethereum python packages
eth-utils                       1.10.0                    eth-utils: Common utility functions for python code that interacts with Ethereum
executing                       0.8.3                     Get the currently executing AST node of a frame, and other information
fastjsonschema                  2.15.3                    Fastest Python implementation of JSON schema
fonttools                       4.33.3                    Tools to manipulate font files
frozenlist                      1.3.0                     A list-like structure which implements
futureproof                     0.3.1                     Bulletproof concurrent.futures
gprof2dot                       2021.2.21                 Generate a dot graph from the output of several profilers.
gql                             2.0.0                     GraphQL client for Python
graphql-core                    2.3.2                     GraphQL implementation for Python
greenlet                        1.1.2                     Lightweight in-process concurrent programming
hexbytes                        0.2.2                     hexbytes: Python `bytes` subclass that decodes hex, with a readable console output
humanfriendly                   10.0                      Human friendly output for text interfaces using Python
idna                            3.3                       Internationalized Domain Names in Applications (IDNA)
imagesize                       1.3.0                     Getting image size from png/jpeg/jpeg2000/gif file
importlib-metadata              4.11.3                    Read metadata from Python packages
ipdb                            0.13.9                    IPython-enabled pdb
ipfshttpclient                  0.8.0a2                   Python IPFS HTTP CLIENT library
ipykernel                       6.13.0                    IPython Kernel for Jupyter
ipython                         8.3.0                     IPython: Productive Interactive Computing
ipython-genutils                0.2.0                     Vestigial utilities from IPython
ipywidgets                      7.7.0                     IPython HTML widgets for Jupyter
jedi                            0.18.1                    An autocompletion tool for Python that can be used for text editors.
jinja2                          3.1.2                     A very fast and expressive template engine.
jsonschema                      4.5.1                     An implementation of JSON Schema validation for Python
jupyter-client                  7.3.1                     Jupyter protocol implementation and client libraries
jupyter-core                    4.10.0                    Jupyter core package. A base package on which Jupyter projects rely.
jupyterlab-pygments             0.2.2                     Pygments theme using JupyterLab CSS variables
jupyterlab-widgets              1.1.0                     A JupyterLab extension.
kiwisolver                      1.4.2                     A fast implementation of the Cassowary constraint solver
lru-dict                        1.1.7                     An Dict like LRU container.
mako                            1.2.0                     A super-fast templating language that borrows the best ideas from the existing templating languages.
markupsafe                      2.1.1                     Safely add untrusted strings to HTML/XML markup.
marshmallow                     3.15.0                    A lightweight library for converting complex datatypes to and from native Python datatypes.
marshmallow-enum                1.5.1                     Enum field for Marshmallow
matplotlib                      3.5.2                     Python plotting package
matplotlib-inline               0.1.3                     Inline Matplotlib backend for Jupyter
mistune                         0.8.4                     The fastest markdown parser in pure Python
more-itertools                  8.13.0                    More routines for operating on iterables, beyond itertools
mplfinance                      0.12.9b0                  Utilities for the visualization, and visual analysis, of financial data
multiaddr                       0.0.9                     Python implementation of jbenet's multiaddr
multidict                       6.0.2                     multidict implementation
multiprocessing-logging         0.3.3                     Logger for multiprocessing applications
mypy-extensions                 0.4.3                     Experimental type system extensions for programs checked with the mypy typechecker.
nbclient                        0.6.3                     A client library for executing notebooks. Formerly nbconvert's ExecutePreprocessor.
nbconvert                       6.5.0                     Converting Jupyter Notebooks
nbformat                        5.4.0                     The Jupyter Notebook format
nbsphinx                        0.8.9                     Jupyter Notebook Tools for Sphinx
nest-asyncio                    1.5.5                     Patch asyncio to allow nested event loops
netaddr                         0.8.0                     A network address manipulation library for Python
notebook                        6.4.11                    A web-based notebook environment for interactive computing
numpy                           1.22.3                    NumPy is the fundamental package for array computing with Python.
opentracing                     2.4.0                     OpenTracing API for Python. See documentation at
packaging                       21.3                      Core utilities for Python packages
pandas                          1.4.2                     Powerful data structures for data analysis, time series, and statistics
pandocfilters                   1.5.0                     Utilities for writing pandoc filters in python
parsimonious                    0.8.1                     (Soon to be) the fastest pure-Python PEG parser I could muster
parso                           0.8.3                     A Python Parser
pendulum                        2.1.2                     Python datetimes made easy
pexpect                         4.8.0                     Pexpect allows easy control of interactive console applications.
pgcli                           3.4.1                     CLI for Postgres Database. With auto-completion and syntax highlighting.
pgspecial                       1.13.1                    Meta-commands handler for Postgres Database.
pickleshare                     0.7.5                     Tiny 'shelve'-like database with concurrency support
pillow                          9.1.0                     Python Imaging Library (Fork)
plotly                          5.8.0                     An open-source, interactive data visualization library for Python
pluggy                          0.13.1                    plugin and hook calling mechanisms for python
prometheus-client               0.14.1                    Python client for the Prometheus monitoring system.
promise                         2.3                       Promises/A+ implementation for Python
prompt-toolkit                  3.0.29                    Library for building powerful interactive command lines in Python
protobuf                        3.20.1                    Protocol Buffers
psutil                          5.9.0                     Cross-platform lib for process and system monitoring in Python.
psycopg2                        2.9.3                     psycopg2 - Python-PostgreSQL Database Adapter
ptyprocess                      0.7.0                     Run a subprocess in a pseudo terminal
pure-eval                       0.2.2                     Safely evaluate AST nodes without side effects
py                              1.11.0                    library with cross-python path, ini-parsing, io, code, log facilities
py-ecc                          5.2.0                     Elliptic curve crypto in python including secp256k1 and alt_bn128
py-evm                          0.5.0a3                   Python implementation of the Ethereum Virtual Machine
py-geth                         3.8.0                     Run Go-Ethereum as a subprocess
pyarrow                         7.0.0                     Python library for Apache Arrow
pycparser                       2.21                      C parser in Python
pycryptodome                    3.14.1                    Cryptographic library for Python
pyethash                        0.1.27                    Python wrappers for ethash, the ethereum proof of workhashing function
pygments                        2.12.0                    Pygments is a syntax highlighting package written in Python.
pyparsing                       3.0.9                     pyparsing module - Classes and methods to define and execute parsing grammars
pyrsistent                      0.18.1                    Persistent/Functional/Immutable data structures
pysha3                          1.0.2                     SHA-3 (Keccak) for Python 2.7 - 3.5
pytest                          5.4.3                     pytest: simple powerful testing with Python
pytest-profiling                1.7.0                     Profiling plugin for py.test
python-dateutil                 2.8.2                     Extensions to the standard Python datetime module
python-logging-discord-handler  0.1.3                     Direct Python log output to Discord
python-logstash-tradingstrategy 0.5.0 eda6d47             Python logging handler for Logstash (forked).
python-redis-lock               3.7.0                     Lock context manager implemented via redis SETNX/BLPOP.
python-slugify                  5.0.2                     A Python Slugify application that handles Unicode
pytz                            2022.1                    World timezone definitions, modern and historical
pytzdata                        2020.1                    The Olson timezone database for Python.
pyzmq                           22.3.0                    Python bindings for 0MQ
redis                           4.3.1                     Python client for Redis database and key-value store
requests                        2.27.1                    Python HTTP for Humans.
retry                           0.9.2                     Easy to use retry decorator.
retrying                        1.3.3                     Retrying
rlp                             2.0.1                     A package for Recursive Length Prefix encoding and decoding
rx                              1.6.1                     Reactive Extensions (Rx) for Python
scipy                           1.8.0                     SciPy: Scientific Library for Python
semantic-version                2.9.0                     A library implementing the 'SemVer' scheme.
send2trash                      1.8.0                     Send file to trash natively under Mac OS X, Windows and Linux.
setproctitle                    1.2.3                     A Python module to customize the process title
setuptools-scm                  6.4.2                     the blessed package to manage your versions by scm tags
six                             1.16.0                    Python 2 and 3 compatibility utilities
snowballstemmer                 2.2.0                     This package provides 29 stemmers for 28 languages generated from Snowball algorithms.
sortedcontainers                2.4.0                     Sorted Containers -- Sorted List, Sorted Dict, Sorted Set
soupsieve                       2.3.2.post1               A modern CSS selector implementation for Beautiful Soup.
sphinx                          4.5.0                     Python documentation generator
sphinxcontrib-applehelp         1.0.2                     sphinxcontrib-applehelp is a sphinx extension which outputs Apple help books
sphinxcontrib-devhelp           1.0.2                     sphinxcontrib-devhelp is a sphinx extension which outputs Devhelp document.
sphinxcontrib-htmlhelp          2.0.0                     sphinxcontrib-htmlhelp is a sphinx extension which renders HTML help files
sphinxcontrib-jsmath            1.0.1                     A sphinx extension which renders display math in HTML via JavaScript
sphinxcontrib-qthelp            1.0.3                     sphinxcontrib-qthelp is a sphinx extension which outputs QtHelp document.
sphinxcontrib-serializinghtml   1.1.5                     sphinxcontrib-serializinghtml is a sphinx extension which outputs "serialized" HTML files (json and pickle).
sqlalchemy                      1.4.18                    Database Abstraction Library
sqlalchemy-utils                0.37.9                    Various utility functions for SQLAlchemy.
sqlparse                        0.4.2                     A non-validating SQL parser.
stack-data                      0.2.0                     Extract data from python stack frames and tracebacks for informative displays
tabulate                        0.8.9                     Pretty-print tabular data
tenacity                        8.0.1                     Retry code until it succeeds
terminado                       0.13.3                    Tornado websocket backend for the Xterm.js Javascript terminal emulator library.
text-unidecode                  1.3                       The most basic Text::Unidecode port
tinycss2                        1.1.1                     A tiny CSS parser
toml                            0.10.2                    Python Library for Tom's Obvious, Minimal Language
tomli                           2.0.1                     A lil' TOML parser
toolz                           0.11.2                    List processing tools and functional utilities
tornado                         6.1                       Tornado is a Python web framework and asynchronous networking library, originally developed at FriendFeed.
tqdm                            4.64.0                    Fast, Extensible Progress Meter
trading-strategy                0.6.9                     Algorithmic trading and quantitative financial analysis framework for decentralised exchanges and blockchains
trading-strategy-backtrader     0.1                       BackTesting Engine (forked)
traitlets                       5.2.0                     Traitlets Python configuration system
trie                            2.0.0a5                   Python implementation of the Ethereum Trie structure
typing-extensions                       Backported and Experimental Type Hints for Python 3.5+
typing-inspect                  0.7.1                     Runtime inspection utilities for typing module.
ujson                           5.2.0                     Ultra fast JSON encoder and decoder for Python
urllib3                         1.26.9                    HTTP library with thread-safe connection pooling, file post, and more.
varint                          1.0.2                     Simple python varint implementation
waitress                        2.1.1                     Waitress WSGI server
wcwidth                         0.2.5                     Measures the displayed width of unicode strings in a terminal
web3                            5.29.0          
web3-ethereum-defi              0.9 ../web3-ethereum-defi Web3-Ethereum-DeFi is a library for smart contracts, DeFi trading (Uniswap, PancakeSwap), Ethereum JSON-RPC, EVM transactions and autom...
webencodings                    0.5.1                     Character encoding aliases for legacy web content
webob                           1.8.7                     WSGI request and response object
websockets                      9.1                       An implementation of the WebSocket Protocol (RFC 6455 & 7692)
webtest                         2.0.35                    Helper to test WSGI applications
widgetsnbextension              3.6.0                     IPython HTML widgets for Jupyter
wrapt                           1.14.1                    Module for decorators, wrappers and monkey patching.
yarl                            1.7.2                     Yet another URL library
zipp                            3.8.0                     Backport of pathlib-compatible object wrapper for zip files

How can we reproduce your problem?

I have no idea.

What is the result that you get?

On my production server that runs a long-running command-line application with manually inserted traces, sometimes I get a warning

trace (14533373b) larger than payload limit (8000000b), dropping

What is the result that you expected?

The warning should hint

I am doing nested traces using the OpenTracing API, so I suspect that a long-running parent trace might cause this.


As a workaround I just poke ddtrace internals directly:

import os
# Workaround for the following warning/error:
# trace (14_533_373b) larger than payload limit (8000000b), dropping
# by setting the tracer write buffer to 128M
# See ddtrace/internal/
# Note that ddtrace evaluates this at import time,
# so make sure you put this code in before importing Datadog.
dd_trace_buffer_val = str(128_000_000)
os.environ["DD_TRACE_WRITER_BUFFER_SIZE_BYTES"] = dd_trace_buffer_val
os.environ["DD_TRACE_WRITER_MAX_PAYLOAD_SIZE_BYTES"] = dd_trace_buffer_val
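
Because the workaround above only takes effect when the variables are set before ddtrace is first imported, the ordering can be guarded explicitly. The following is a minimal sketch, not part of ddtrace: `configure_ddtrace_env` is a hypothetical helper, and the variable names are simply the ones used in the comment above.

```python
import os
import sys


def configure_ddtrace_env(settings, environ=None, modules=None):
    """Copy settings into the environment, refusing if ddtrace is already loaded.

    Hypothetical helper for illustration; ddtrace does not ship this function.
    """
    environ = os.environ if environ is None else environ
    modules = sys.modules if modules is None else modules
    if "ddtrace" in modules:
        # The values would be read too late to have any effect.
        raise RuntimeError("ddtrace is already imported; set these variables earlier")
    for key, value in settings.items():
        environ[key] = str(value)  # environment values must be strings
    return environ


if __name__ == "__main__":
    size = 128_000_000
    configure_ddtrace_env(
        {
            "DD_TRACE_WRITER_BUFFER_SIZE_BYTES": size,
            "DD_TRACE_WRITER_MAX_PAYLOAD_SIZE_BYTES": size,
        }
    )
    # Only import ddtrace after this point.
```

Raising an error instead of silently setting the variables makes a wrong import order visible at startup rather than later, when traces are dropped.
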
miohtama commented 2 years ago

After doing this change, I am hitting another error

 failed to send traces to Datadog Agent at http://localhost:8126: HTTP error status 413, reason Request Entity Too Large
miohtama commented 2 years ago

I am now trying partial flushes and a lower sample rate to see whether this mitigates the issue:

os.environ["DD_TRACER_PARTIAL_FLUSH_ENABLED"] = "true"
# We have a lot (millions) of SQL traces
# Trace only 10% of all
os.environ["DD_TRACE_SAMPLE_RATE"] = "0.1"
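
To illustrate why partial flushing can keep a long-running trace under the payload limit, here is a toy model of the idea: finished spans are sent in batches once a threshold is reached, instead of waiting for the root span to finish. This is a sketch under stated assumptions, not ddtrace's implementation; `ToyTraceBuffer`, `min_spans`, and the threshold of 3 are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class ToyTraceBuffer:
    """Toy model of partial flushing; not how ddtrace is actually written."""

    send: Callable[[List[str]], None]
    min_spans: int = 3  # hypothetical threshold for this example
    finished: List[str] = field(default_factory=list)

    def finish_span(self, name: str) -> None:
        # Collect the finished span and flush once enough have accumulated.
        self.finished.append(name)
        if len(self.finished) >= self.min_spans:
            self.flush()

    def flush(self) -> None:
        # Send whatever is buffered as one payload, then start a new batch.
        if self.finished:
            self.send(list(self.finished))
            self.finished.clear()


batches: List[List[str]] = []
buf = ToyTraceBuffer(send=batches.append, min_spans=3)
for i in range(7):
    buf.finish_span(f"sql-{i}")
buf.flush()  # the long-running parent finishes; send the remainder

print([len(batch) for batch in batches])
```

Without the threshold, all seven spans would be held until the parent finished and sent as a single payload, which is the situation that grows past the limit in a long-running process.
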
majorgreys commented 2 years ago

@miohtama The configuration options for enabling partial flushing are not documented as you describe. I will correct this as a follow-up to your issue.

We offer the partial flush and payload size configuration options you found as mitigations at the moment.

For partial flushing, the minimum version requirements are ddtrace v1.1.1 and Datadog Agent v7.25.0.

For payload sizes, the Datadog Agent in v7.21.0 increased the maximum trace payload size from 10 MB to 50 MB.
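
The 413 response reported earlier in the thread is consistent with a client-side limit (128M) that exceeds what the Agent accepts. A hedged sketch of keeping the client setting at or below the Agent limit follows; `clamp_payload_size` is a hypothetical helper, and the exact byte value of the 50 MB limit is an assumption (taken here as a decimal 50,000,000).

```python
# Assumption: the Agent's 50 MB limit is treated as 50,000,000 bytes here;
# the exact byte value is not stated in this thread.
AGENT_MAX_PAYLOAD_BYTES = 50_000_000


def clamp_payload_size(requested_bytes: int,
                       agent_limit: int = AGENT_MAX_PAYLOAD_BYTES) -> int:
    """Return a client payload limit that does not exceed the Agent's limit."""
    if requested_bytes <= 0:
        raise ValueError("payload size must be positive")
    return min(requested_bytes, agent_limit)


# The 128M value from the workaround is reduced to the assumed Agent limit,
# while the 8,000,000-byte value from the warning is left unchanged.
print(clamp_payload_size(128_000_000))
print(clamp_payload_size(8_000_000))
```
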

I am very interested in your use case of a long-running command line program. Capturing all the work in the run of such a command as a single trace can be valuable. I wonder if you could share more on how you see this data being useful.

miohtama commented 2 years ago

Thank you for the prompt and informative reply. I did indeed notice that more traces were still being dropped than accepted (dropped: 1, accepted: 0) with my example config and ddtrace 0.50.4.

I have upgraded ddtrace to 1.2.0 and will see whether partial flushes get rid of my errors.

majorgreys commented 2 years ago

@miohtama Did enabling partial flushing resolve the problem for you?

miohtama commented 2 years ago

Hey yes. I managed to solve it with a partial flush. Eventually, I ended up using the following env var hack:

os.environ["DD_TRACER_PARTIAL_FLUSH_ENABLED"] = "true"
os.environ["DD_TRACE_SAMPLE_RATE"] = "0.1"   

However, I am facing another issue: nested ddtrace.opentracer.Tracer objects do not seem to work with the Datadog service. I have not yet had time to investigate why this is, or whether it is related to partial flushes.

emmettbutler commented 1 year ago

@miohtama if you're still having an issue with nested Tracer objects, please open a new issue describing the problem. Thanks again for the contribution!