DataDog / dd-trace-py

Datadog Python APM Client
https://ddtrace.readthedocs.io/

"Error adding sample to profile (ddog_prof_Profile_add failed: Duplicate label on sample" error after enabling experimental profilers #10245

Open gibsondan opened 4 weeks ago

gibsondan commented 4 weeks ago

Summary of problem

After enabling the experimental stack sampling feature for the Datadog profiler (by setting DD_PROFILING_EXPORT_LIBDD_ENABLED and DD_PROFILING_STACK_V2_ENABLED to true in my process), the process repeatedly logs errors like this:

Error adding sample to profile (ddog_prof_Profile_add failed: Duplicate label on sample: Label { key: "task id", str: None, num: 140071738420352, num_unit: None } Label { key: "task id", str: None, num: 140071738420352, num_unit: None })
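For anyone trying to match this configuration, here is a minimal sketch of one way the two flags could be passed to a child process. It is not necessarily how they were set in the affected services. The helper names `build_env` and `launch` are hypothetical, and including DD_PROFILING_ENABLED is an assumption (the report only names the other two variables).

```python
import os
import subprocess

# The two experimental flags come from the report above.
# DD_PROFILING_ENABLED is an assumption: the report does not say how the
# profiler itself was switched on.
PROFILER_ENV = {
    "DD_PROFILING_ENABLED": "true",
    "DD_PROFILING_EXPORT_LIBDD_ENABLED": "true",
    "DD_PROFILING_STACK_V2_ENABLED": "true",
}


def build_env(base=None):
    """Return a copy of the base environment with the profiler flags added."""
    env = dict(os.environ if base is None else base)
    env.update(PROFILER_ENV)
    return env


def launch(argv):
    """Run argv as a child process with the profiler flags set (hypothetical helper)."""
    return subprocess.run(argv, env=build_env(), check=False)


# Hypothetical usage; the command line is illustrative only:
# launch(["ddtrace-run", "uvicorn", "app:app"])
```

Setting the variables in the parent environment before the server starts avoids any question of whether they were read before the profiler was initialised.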

I can see that there is a duplicate task id value being set, but am unsure how to prevent that from happening.

From what I can tell, the profiler is still running when this happens, but it only reports CPU information, not memory information.

Which version of dd-trace-py are you using? 2.9.5

Which version of pip are you using? 24.1.2

Which libraries and their versions are you using?

`pip freeze`

agate==1.7.1
aiohappyeyeballs==2.3.5
aiohttp==3.10.3
aiosignal==1.3.1
alembic==1.12.0
amqp==5.2.0
aniso8601==9.0.1
annotated-types==0.6.0
anyio==3.7.1
asgi-idempotency-header==0.2.0
attrs==23.2.0
Authlib==1.3.1
awswrangler==3.7.2
Babel==2.14.0
backoff==2.2.1
beautifulsoup4==4.12.3
bleach==6.1.0
boto3==1.34.72
boto3-stubs==1.34.161
botocore==1.34.72
botocore-stubs==1.34.161
bytecode==0.15.1
cachetools==5.3.3
cattrs==23.2.3
certifi==2024.2.2
cffi==1.16.0
charset-normalizer==3.3.2
click==8.1.7
colorama==0.4.6
coloredlogs==14.0
croniter==2.0.3
cryptography==42.0.5
daff==1.3.46
datadog==0.49.1
dbt-adapters==1.4.1
dbt-common==1.7.0
dbt-core==1.8.5
dbt-extractor==0.5.1
dbt-semantic-interfaces==0.5.1
ddsketch==3.0.1
ddtrace==2.9.5
deepdiff==7.0.1
defusedxml==0.7.1
Deprecated==1.2.14
docstring_parser==0.16
elastic-transport==8.15.0
elasticsearch==8.15.0
envier==0.5.1
fastapi==0.110.0
fastjsonschema==2.19.1
filelock==3.15.4
frozenlist==1.4.1
fsspec==2024.3.1
gitdb==4.0.11
github3.py==4.0.1
GitPython==3.1.42
google-api-core==2.18.0
google-auth==2.29.0
google-cloud-recaptcha-enterprise==1.19.0
googleapis-common-protos==1.63.0
gql==3.5.0
graphene==3.3
graphql-core==3.2.3
graphql-relay==3.2.0
greenlet==3.0.3
grpcio==1.62.1
grpcio-health-checking==1.62.1
grpcio-status==1.62.1
gunicorn==23.0.0
h11==0.14.0
hiredis==2.3.2
httpcore==1.0.5
httptools==0.6.1
httpx==0.27.0
humanfriendly==10.0
idna==3.7
importlib-metadata==6.11.0
isodate==0.6.1
itsdangerous==2.1.2
Jinja2==3.1.4
jmespath==1.0.1
jsonschema==4.21.1
jsonschema-specifications==2023.12.1
jupyter_client==8.6.1
jupyter_core==5.7.2
jupyterlab_pygments==0.3.0
kombu==5.2.4
kubernetes==29.0.0
leather==0.4.0
Logbook==1.5.3
lxml==4.9.4
Mako==1.3.2
markdown-it-py==3.0.0
MarkupSafe==2.1.5
mashumaro==3.12
mdurl==0.1.2
minimal-snowplow-tracker==0.0.2
mistune==3.0.2
more-itertools==10.2.0
msgpack==1.0.8
multidict==6.0.5
mypy-boto3-cloudformation==1.34.111
mypy-boto3-dynamodb==1.34.148
mypy-boto3-ec2==1.34.159
mypy-boto3-lambda==1.34.77
mypy-boto3-rds==1.34.152
mypy-boto3-s3==1.34.160
mypy-boto3-sqs==1.34.121
nbclient==0.10.0
nbconvert==7.16.3
nbformat==5.10.3
networkx==3.2.1
numpy==1.26.4
oauthlib==3.2.2
opentelemetry-api==1.23.0
ordered-set==4.1.0
orjson==3.10.0
packaging==24.0
pandas==2.2.1
pandocfilters==1.5.1
parsedatetime==2.6
passlib==1.7.4
pathspec==0.11.2
pdpyras==5.2.0
pendulum==2.1.2
pex==2.2.2
platformdirs==4.2.0
prometheus_client==0.20.0
prompt-toolkit==3.0.36
proto-plus==1.23.0
protobuf==4.25.3
psutil==6.0.0
psycopg2-binary==2.9.9
pyarrow==15.0.2
pyasn1==0.6.0
pyasn1_modules==0.4.0
pycparser==2.21
pydantic==2.6.4
pydantic_core==2.16.3
Pygments==2.17.2
PyJWT==2.8.0
PyNaCl==1.5.0
python-dateutil==2.9.0.post0
python-dotenv==1.0.1
python-gitlab==4.4.0
python-multipart==0.0.9
python-slugify==8.0.4
python3-saml==1.16.0
pytimeparse==1.1.8
pytz==2024.1
pytzdata==2020.1
PyYAML==6.0.1
pyzmq==25.1.2
questionary==2.0.1
redis==5.0.3
referencing==0.34.0
requests==2.32.3
requests-oauthlib==2.0.0
requests-toolbelt==1.0.0
rich==13.7.1
rpds-py==0.18.0
rsa==4.9
s3transfer==0.10.1
segment-analytics-python==2.3.2
shellingham==1.5.4
six==1.16.0
slack-bolt==1.18.1
slack_sdk==3.27.1
smmap==5.0.1
sniffio==1.3.1
soupsieve==2.5
SQLAlchemy==1.4.52
sqlglot==23.2.0
sqlglotrs==0.1.3
sqlparse==0.5.1
starlette==0.36.3
stripe==3.5.0
structlog==24.1.0
tabulate==0.9.0
text-unidecode==1.3
tinycss2==1.2.1
tomli==2.0.1
toposort==1.10
tornado==6.4
tqdm==4.66.5
traitlets==5.14.2
typer==0.11.0
types-awscrt==0.21.2
types-boto==2.49.18.20240205
types-pyOpenSSL==24.0.0.20240311
types-redis==4.6.0.20240311
types-s3transfer==0.10.1
types-stripe==3.5.2.20240106
typing_extensions==4.10.0
tzdata==2024.1
ulid-py==1.1.0
universal_pathlib==0.2.2
uritemplate==4.1.1
urllib3==2.2.2
uv==0.2.36
uvicorn==0.29.0
uvloop==0.19.0
validators==0.23.2
vine==5.1.0
watchdog==4.0.0
watchfiles==0.21.0
wcwidth==0.2.13
webencodings==0.5.1
websocket-client==1.7.0
websockets==10.4
wrapt==1.16.0
xmlsec==1.3.13
xmltodict==0.13.0
yarl==1.9.4
zipp==3.18.1

How can we reproduce your problem?

What is the result that you get?

Logspew like the following, no memory profiling

Error adding sample to profile (ddog_prof_Profile_add failed: Duplicate label on sample: Label { key: "task id", str: None, num: 140071724400704, num_unit: None } Label { key: "task id", str: None, num: 140071724400704, num_unit: None })
Error adding sample to profile (ddog_prof_Profile_add failed: Duplicate label on sample: Label { key: "task id", str: None, num: 140071724400704, num_unit: None } Label { key: "task id", str: None, num: 140071724400704, num_unit: None })
Error adding sample to profile (ddog_prof_Profile_add failed: Duplicate label on sample: Label { key: "task id", str: None, num: 140072623446848, num_unit: None } Label { key: "task id", str: None, num: 140072623446848, num_unit: None })
Error adding sample to profile (ddog_prof_Profile_add failed: Duplicate label on sample: Label { key: "task id", str: None, num: 140072623446848, num_unit: None } Label { key: "task id", str: None, num: 140072623446848, num_unit: None })
Error adding sample to profile (ddog_prof_Profile_add failed: Duplicate label on sample: Label { key: "task id", str: None, num: 140072623443968, num_unit: None } Label { key: "task id", str: None, num: 140072623443968, num_unit: None })
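To get a sense of how many distinct task ids are involved, the log lines above can be tallied with a short script. This is a rough sketch that assumes every error line has exactly the shape shown here (two `Label { ... }` entries with `str: None`). The function name `count_duplicate_labels` is hypothetical and not part of ddtrace.

```python
import re
from collections import Counter

# Matches one `Label { key: "...", str: ..., num: ..., num_unit: ... }` entry
# in the format seen in the log lines above. Other label shapes (for example
# a populated `str` field) are not handled by this sketch.
LABEL_RE = re.compile(
    r'Label \{ key: "(?P<key>[^"]+)", str: (?P<str>\w+), '
    r"num: (?P<num>\d+), num_unit: (?P<unit>\w+) \}"
)


def count_duplicate_labels(lines):
    """Count error lines per (label key, numeric value) pair."""
    counts = Counter()
    for line in lines:
        if "Duplicate label on sample" not in line:
            continue
        labels = LABEL_RE.findall(line)
        # Only count lines where the two reported labels are identical.
        if len(labels) >= 2 and labels[0] == labels[1]:
            key, _str, num, _unit = labels[0]
            counts[(key, int(num))] += 1
    return counts
```

Feeding it the captured log (for example `count_duplicate_labels(open("app.log"))`, where `app.log` is a placeholder path) gives a per-id count, which may help show whether the duplicates cluster on a few tasks or appear on every sampled task.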

What is the result that you expected?

No logspew, memory profiling

gibsondan commented 4 weeks ago

A couple of other details in case they're helpful: this is a starlette + uvicorn app running a GraphQL server. We've observed it in two different services with a similar setup but different business logic and schemas, after enabling the experimental stack sampling.