DataDog / dd-trace-py

Datadog Python APM Client
https://ddtrace.readthedocs.io/

Getting failed to send traces to Datadog Agent when DD_TRACE_ENABLED=false #4179

Closed fredrikaverpil closed 7 months ago

fredrikaverpil commented 1 year ago

Which version of dd-trace-py are you using?

1.4.4

Which version of pip are you using?

22.2.2

Which version of the libraries are you using?

N/A

How can we reproduce your problem?

N/A

What is the result that you get?

Even though I have DD_TRACE_ENABLED=false, I get this message many times when running code locally:

ddtrace.internal.logger:handle:158 - failed to send traces to Datadog Agent at http://localhost:8126/v0.4/traces

In my code I am calling e.g. span.set_tags(...) and span.set_exc_info(...), but I wouldn't expect traces to be sent off to Datadog because of this.
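
For context, the span usage in question looks roughly like this (a sketch with hypothetical function and tag names, not the actual service code):

import sys

from ddtrace import tracer

def handle_job(payload):
    # Hypothetical function; only the span calls mirror what is described above.
    with tracer.trace("job.run") as span:
        span.set_tags({"job.size": str(len(payload))})
        try:
            if not payload:
                raise ValueError("empty payload")  # stand-in for the real work failing
        except Exception:
            span.set_exc_info(*sys.exc_info())
            raise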

Do you have any idea what could be triggering a trace to be sent, even if DD_TRACE_ENABLED=false?

What is the result that you expected?

I would not expect an attempt to send traces to Datadog when DD_TRACE_ENABLED=false.

More info:

ERROR    ddtrace.internal.writer:writer.py:567 failed to send traces to Datadog Agent at http://localhost:8126/v0.4/traces
Traceback (most recent call last):
  File "/Users/fredrik/code/repos/<REDACTED>/.venv/lib/python3.10/site-packages/tenacity/__init__.py", line 407, in __call__
    result = fn(*args, **kwargs)
  File "/Users/fredrik/code/repos/<REDACTED>/.venv/lib/python3.10/site-packages/ddtrace/internal/writer.py", line 446, in _send_payload
    response = self._put(payload, headers)
  File "/Users/fredrik/code/repos/<REDACTED>/.venv/lib/python3.10/site-packages/ddtrace/internal/writer.py", line 398, in _put
    self._conn.request("PUT", self._endpoint, data, headers)
  File "/Users/fredrik/.pyenv/versions/3.10.5/lib/python3.10/http/client.py", line 1282, in request
    self._send_request(method, url, body, headers, encode_chunked)
  File "/Users/fredrik/.pyenv/versions/3.10.5/lib/python3.10/http/client.py", line 1328, in _send_request
    self.endheaders(body, encode_chunked=encode_chunked)
  File "/Users/fredrik/.pyenv/versions/3.10.5/lib/python3.10/http/client.py", line 1277, in endheaders
    self._send_output(message_body, encode_chunked=encode_chunked)
  File "/Users/fredrik/.pyenv/versions/3.10.5/lib/python3.10/http/client.py", line 1037, in _send_output
    self.send(msg)
  File "/Users/fredrik/.pyenv/versions/3.10.5/lib/python3.10/http/client.py", line 975, in send
    self.connect()
  File "/Users/fredrik/.pyenv/versions/3.10.5/lib/python3.10/http/client.py", line 941, in connect
    self.sock = self._create_connection(
  File "/Users/fredrik/.pyenv/versions/3.10.5/lib/python3.10/socket.py", line 845, in create_connection
    raise err
  File "/Users/fredrik/.pyenv/versions/3.10.5/lib/python3.10/socket.py", line 833, in create_connection
    sock.connect(sa)
ConnectionRefusedError: [Errno 61] Connection refused

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/Users/fredrik/code/repos/<REDACTED>/.venv/lib/python3.10/site-packages/ddtrace/internal/writer.py", line 559, in flush_queue
    self._retry_upload(self._send_payload, encoded, n_traces)
  File "/Users/fredrik/code/repos/<REDACTED>/.venv/lib/python3.10/site-packages/tenacity/__init__.py", line 404, in __call__
    do = self.iter(retry_state=retry_state)
  File "/Users/fredrik/code/repos/<REDACTED>/.venv/lib/python3.10/site-packages/tenacity/__init__.py", line 361, in iter
    raise retry_exc from fut.exception()
tenacity.RetryError: RetryError[<Future at 0x110f12470 state=finished raised ConnectionRefusedError>]
Yun-Kim commented 1 year ago

Hi @fredrikaverpil, I'm trying to reproduce this on my end but I'm not seeing the first error log ddtrace.internal.logger:handle:158 - failed to send traces to Datadog Agent at http://localhost:8126/v0.4/traces that you mentioned in your traceback. Can you clarify where you're seeing that exact error from, and if possible could you add that traceback to this issue? Thanks!

fredrikaverpil commented 1 year ago

Ok, I'm pretty sure I found what is causing me to see this traceback.

I have a custom logger, which is based on loguru. When setting up this logger, I also intercept all standard library logging:

import logging
from loguru import logger as loguru_logger

class LoguruInterceptHandler(logging.Handler):
    """Intercept standard logging messages toward Loguru sinks.

    Notes:
        This will e.g. make it possible to see the SQL debug logs from
        ORMs such as Peewee.
    """

    def emit(self, record):
        # Get corresponding Loguru level if it exists
        try:
            level = loguru_logger.level(record.levelname).name
        except ValueError:
            level = record.levelno

        # Find caller from where originated the logged message
        frame, depth = logging.currentframe(), 2
        while frame.f_code.co_filename == logging.__file__:
            frame = frame.f_back
            depth += 1

        loguru_logger.opt(depth=depth, exception=record.exc_info).bind(
            metadata={"intercepted_from_std_logging": True}
        ).log(
            level,
            str(record.getMessage()),
        )

def setup_interceptor():
    # intercept everything at the root logger
    logging.root.handlers = [LoguruInterceptHandler()]
    logging.root.setLevel("INFO")

    # remove every other logger's handlers
    # and propagate to root logger
    for name in logging.root.manager.loggerDict.keys():
        logging.getLogger(name).handlers = []
        logging.getLogger(name).propagate = True

setup_interceptor()

This code executes late in my service, i.e. after most libraries have already created their loggers.

When I leave this out, the traceback no longer appears in the terminal when e.g. running pytest in my project. But I believe the ddtrace error log is actually just silenced and still fires anyway.

fredrikaverpil commented 1 year ago

Could it be that the ddtrace library silently logs this message (here), and my interceptor simply found it and also picked up the raised exception traceback?

If I add a print("hello") next to this log.error statement, I see it being printed when e.g. running pytest in my project. And I have DD_TRACE_ENABLED=false.
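
For what it's worth, that would just be ordinary stdlib logging behaviour rather than anything ddtrace-specific: a handler installed on the root logger receives records from any library logger that propagates, even if the library itself never prints them. A minimal sketch (with a made-up logger name):

import logging

class EchoHandler(logging.Handler):
    """Hypothetical stand-in for the Loguru interceptor above."""

    def emit(self, record):
        print("intercepted:", record.name, record.getMessage())

logging.root.handlers = [EchoHandler()]
logging.root.setLevel("INFO")

# A library logger with no handlers of its own still propagates to root,
# so the interceptor sees records the library would otherwise keep quiet.
logging.getLogger("somelib.internal").error("failed to send traces")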

fredrikaverpil commented 1 year ago

If the above assumption is correct, I don't think ddtrace should even attempt to send traces when DD_TRACE_ENABLED=false (and then also not have to log that error).

Yun-Kim commented 1 year ago

Hi @fredrikaverpil, just so I can clarify what might be the problem, could you add a few debug statements next to the log.error statement you mentioned? Thanks!

fredrikaverpil commented 1 year ago

@Yun-Kim Sure!

I added some prints:

    def flush_queue(self, raise_exc=False):
        print("In flush_queue")
        # type: (bool) -> None
        try:
            ...
                else:
                    print("Debug logging for Datadog", flush=True)
                    print(f"{n_traces=}", flush=True)
                    print(f"{encoded=}", flush=True)
                    print(f"{os.getenv('DD_TRACE_ENABLED')=}", flush=True)

                    log.error("failed to send traces to Datadog Agent at %s", self._agent_endpoint, exc_info=True)

The results:

n_traces=1
encoded=b'\x91\x91\x89\xa8trace_id\xcfM\x08\xf2\xdcVq\xbeG\xa7span_id\xcf6\x1d\x8aA\xc9\x9c\x99\xce\xa7service\xb8conversations-migrations\xa8resource\xadinitContainer\xa4name\xd9 conversations.run_migrations.run\xa5start\xcf\x17\x14\n\xa3\xdc\x7f\x1a`\xa8duration\xce\x1ae\xc4H\xa4meta\x82\xaaruntime-id\xd9 e1b588fb6af148f9a628ebe93351bc40\xa8_dd.p.dm\xa2-0\xa7metrics\x85\xad_dd.agent_psr\xcb?\xf0\x00\x00\x00\x00\x00\x00\xaasystem.pid\xcd\xbbe\xad_dd.top_level\x01\xb5_sampling_priority_v1\x01\xad_dd.tracer_kr\xcb?\xf0\x00\x00\x00\x00\x00\x00'
os.getenv('DD_TRACE_ENABLED')='false'

I'm not using tracer.configure() and I'm only relying on environment variables. However, I also printed out enabled, but it doesn't get printed. It seems that importing enabled somehow stops print from printing anything:

                else:
                    from ddtrace.tracer import enabled
                    print(f"{enabled=}", flush=True)

                    log.error("failed to send traces to Datadog Agent at %s", self._agent_endpoint, exc_info=True)

At first, I noticed that I did not have my db container running, and so I received a very large value full of tracebacks in encoded. But after starting up that container, I got the above value.

I ran pytest to trigger all of this. The full output is below:

Details

❯ pytest --count=100 -s -k test_s3
====================================================== test session starts =======================================================
platform darwin -- Python 3.10.5, pytest-7.1.2, pluggy-1.0.0
Using --randomly-seed=107146139
rootdir: /Users/fredrik/code/repos/<REDACTED>, configfile: pyproject.toml, testpaths: tests
plugins: anyio-3.6.1, Faker-14.1.0, freezegun-0.4.2, ddtrace-1.4.4, randomly-3.12.0, mock-3.8.2, dotenv-0.5.2, repeat-0.9.1, cov-3.0.0
collected 22693 items / 22193 deselected / 500 selected

tests/pytest/unit/message/test_s3_url.py ........................In flush_queue
.....Debug logging for Datadog
n_traces=1
encoded=b'\x91\x91\x89\xa8trace_id\xcfM\x08\xf2\xdcVq\xbeG\xa7span_id\xcf6\x1d\x8aA\xc9\x9c\x99\xce\xa7service\xb8conversations-migrations\xa8resource\xadinitContainer\xa4name\xd9 conversations.run_migrations.run\xa5start\xcf\x17\x14\n\xa3\xdc\x7f\x1a`\xa8duration\xce\x1ae\xc4H\xa4meta\x82\xaaruntime-id\xd9 e1b588fb6af148f9a628ebe93351bc40\xa8_dd.p.dm\xa2-0\xa7metrics\x85\xad_dd.agent_psr\xcb?\xf0\x00\x00\x00\x00\x00\x00\xaasystem.pid\xcd\xbbe\xad_dd.top_level\x01\xb5_sampling_priority_v1\x01\xad_dd.tracer_kr\xcb?\xf0\x00\x00\x00\x00\x00\x00'
os.getenv('DD_TRACE_ENABLED')='false'
2022-09-12 08:57:42.959 | ERROR    | ddtrace.internal.logger:handle:158 - failed to send traces to Datadog Agent at http://localhost:8126/v0.4/traces
Traceback (most recent call last):
  File "/Users/fredrik/code/repos/<REDACTED>/.venv/lib/python3.10/site-packages/tenacity/__init__.py", line 407, in __call__
    result = fn(*args, **kwargs)
  File "/Users/fredrik/code/repos/<REDACTED>/.venv/lib/python3.10/site-packages/ddtrace/internal/writer.py", line 446, in _send_payload
    response = self._put(payload, headers)
  File "/Users/fredrik/code/repos/<REDACTED>/.venv/lib/python3.10/site-packages/ddtrace/internal/writer.py", line 398, in _put
    self._conn.request("PUT", self._endpoint, data, headers)
  File "/Users/fredrik/.pyenv/versions/3.10.5/lib/python3.10/http/client.py", line 1282, in request
    self._send_request(method, url, body, headers, encode_chunked)
  File "/Users/fredrik/.pyenv/versions/3.10.5/lib/python3.10/http/client.py", line 1328, in _send_request
    self.endheaders(body, encode_chunked=encode_chunked)
  File "/Users/fredrik/.pyenv/versions/3.10.5/lib/python3.10/http/client.py", line 1277, in endheaders
    self._send_output(message_body, encode_chunked=encode_chunked)
  File "/Users/fredrik/.pyenv/versions/3.10.5/lib/python3.10/http/client.py", line 1037, in _send_output
    self.send(msg)
  File "/Users/fredrik/.pyenv/versions/3.10.5/lib/python3.10/http/client.py", line 975, in send
    self.connect()
  File "/Users/fredrik/.pyenv/versions/3.10.5/lib/python3.10/http/client.py", line 941, in connect
    self.sock = self._create_connection(
  File "/Users/fredrik/.pyenv/versions/3.10.5/lib/python3.10/socket.py", line 845, in create_connection
    raise err
  File "/Users/fredrik/.pyenv/versions/3.10.5/lib/python3.10/socket.py", line 833, in create_connection
    sock.connect(sa)
ConnectionRefusedError: [Errno 61] Connection refused

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/Users/fredrik/.pyenv/versions/3.10.5/lib/python3.10/threading.py", line 973, in _bootstrap
    self._bootstrap_inner()
  File "/Users/fredrik/.pyenv/versions/3.10.5/lib/python3.10/threading.py", line 1016, in _bootstrap_inner
    self.run()
  File "/Users/fredrik/code/repos/<REDACTED>/.venv/lib/python3.10/site-packages/ddtrace/internal/periodic.py", line 73, in run
    self._target()
  File "/Users/fredrik/code/repos/<REDACTED>/.venv/lib/python3.10/site-packages/ddtrace/internal/writer.py", line 587, in periodic
    self.flush_queue(raise_exc=False)
> File "/Users/fredrik/code/repos/<REDACTED>/.venv/lib/python3.10/site-packages/ddtrace/internal/writer.py", line 560, in flush_queue
    self._retry_upload(self._send_payload, encoded, n_traces)
  File "/Users/fredrik/code/repos/<REDACTED>/.venv/lib/python3.10/site-packages/tenacity/__init__.py", line 404, in __call__
    do = self.iter(retry_state=retry_state)
  File "/Users/fredrik/code/repos/<REDACTED>/.venv/lib/python3.10/site-packages/tenacity/__init__.py", line 361, in iter
    raise retry_exc from fut.exception()
tenacity.RetryError: RetryError[]
.....................In flush_queue ......................In flush_queue ......................In flush_queue ......................In flush_queue ......................In flush_queue .....................In flush_queue .....................In flush_queue .......................In flush_queue ......................In flush_queue .......................In flush_queue ......................In flush_queue .......................In flush_queue .......................In flush_queue .....................In flush_queue ......................In flush_queue .......................In flush_queue .....................In flush_queue .....................In flush_queue .....................In flush_queue .......................In flush_queue ......................In flush_queue ..........
======================================================== warnings summary ========================================================
In flush_queue
.venv/lib/python3.10/site-packages/pytest_freezegun.py:17: 45386 warnings
  /Users/fredrik/code/repos/<REDACTED>/.venv/lib/python3.10/site-packages/pytest_freezegun.py:17: DeprecationWarning: distutils Version classes are deprecated. Use packaging.version instead.
    if LooseVersion(pytest.__version__) < LooseVersion('3.6.0'):

-- Docs: https://docs.pytest.org/en/stable/how-to/capture-warnings.html

---------- coverage: platform darwin, python 3.10.5-final-0 ----------
Coverage HTML written to dir htmlcov

===================================== 500 passed, 22193 deselected, 45386 warnings in 26.72s =====================================
In flush_queue

In case you're wondering what `In flush_queue` means, it's just me adding a print in this function:

def flush_queue(self, raise_exc=False):
    print("In flush_queue")
    # type: (bool) -> None
majorgreys commented 1 year ago

I'm not using tracer.configure() and I'm only relying on environment variables. However, I also printed out enabled, but it doesn't get printed. It seems importing enabled somehow stops print from printing anything:

@fredrikaverpil That import does not work. The enabled property is on the global tracer object, ddtrace.tracer.

You'll want to use something like this for debugging:

from ddtrace import tracer
print(f"{tracer.enabled=}", flush=True)
Yun-Kim commented 1 year ago

@fredrikaverpil Just to add on, how are you running ddtrace and setting the environment variable DD_TRACE_ENABLED=false?

fredrikaverpil commented 1 year ago

@fredrikaverpil Just to add on, how are you running ddtrace and setting the environment variable DD_TRACE_ENABLED=false?

We wrap our microservices with ddtrace-run in the Dockerfile:

FROM python:3.10-slim-bullseye

...

ENTRYPOINT ["ddtrace-run", "python"]
CMD ["app.py"]

But when I run pytest I don't run ddtrace at all, and this is where I am having these issues. Somehow tracing is enabled when running pytest.

I'm running this command in my terminal with activated venv:

pytest --count=100 -s -k <test_name>

I have the following pytest packages installed:

pytest                      7.1.2
pytest-cov                  3.0.0
pytest-dotenv               0.5.2
pytest-freezegun            0.4.2
pytest-mock                 3.8.2
pytest-randomly             3.12.0
pytest-repeat               0.9.1
python-dateutil             2.8.2
python-dotenv               0.20.0

and I've got the following in my pyproject.toml:

[tool.pytest.ini_options]
testpaths = "tests"
env_files = ["env/local_test.env"]
addopts = "--cov --cov-report=html --log-level=INFO -rxXs"

In the .env file, I've got, among other things:

TEST=1
DD_TRACE_ENABLED=false
LOG_LEVEL="INFO"
PYTHONDEVMODE="1"
fredrikaverpil commented 1 year ago

You'll want to use something like this for debugging:


from ddtrace import tracer

print(f"{tracer.enabled=}", flush=True)

Right! I got this:

tracer.enabled=True
ZStriker19 commented 1 year ago

Hi Fred, taking over for Yun here.

Could you also try just printing out the environment variable value for DD_TRACE_ENABLED so we can check all our bases?

enabled = os.getenv("DD_TRACE_ENABLED")
print(f"DD_TRACE_ENABLED= {enabled}", flush=True)

It seems that for some reason the tracer is still running with your test code despite you not running with ddtrace-run (unless hidden somewhere?) in the tests.

My original thought was that you were enabling the tracer somewhere in your code by running patch() or patch_all(); however, I tested this and DD_TRACE_ENABLED=false actually keeps both ddtrace-run and the patch methods from generating traces. Therefore... I'm wondering whether the environment variable DD_TRACE_ENABLED=false is getting picked up or not.

fredrikaverpil commented 1 year ago

Could you also try just printing out the environment variable value for DD_TRACE_ENABLED so we can check all our bases?

I actually already did that in https://github.com/DataDog/dd-trace-py/issues/4179#issuecomment-1243300395

Yeah, this is a mighty weird one. I'll dig into this tomorrow (late night here now) to see if I can narrow this down further. To me it seems something is enabling the tracer even if the environment variable is present and set to "false".

fredrikaverpil commented 1 year ago

I think I just found what is causing all this.

I have this in one place in my code that gets executed when running tests in pytest:

@tracer.wrap(service="xxx", resource="yyy")
def run():
    ...

As soon as I comment this line out, I can no longer see the error log and traceback. I've reproduced this now like 20 times, so I am 100% sure about this.

The @tracer.wrap is decorating a function used by a pytest fixture which is used to set up/tear down db migrations for my integration tests. I added debug prints again inside this function and this is the output:

@tracer.wrap(service="xxx", resource="yyy")
def run():
    logger.info(f"{os.getenv('DD_TRACE_ENABLED')=}")
    logger.info(f"{tracer.enabled=}")
2022-09-13 08:25:04.069 | INFO     | <REDACTED>:34 - os.getenv('DD_TRACE_ENABLED')='false'
2022-09-13 08:25:04.069 | INFO     | <REDACTED>:35 - tracer.enabled=True

What's weird here is that even if I remove the @tracer.wrap, I get the same logged output as above, indicating that the tracer is still enabled.

But I can 100% verify here that when removing @tracer.wrap, I no longer get the ddtrace error log + traceback.

ZStriker19 commented 1 year ago

That is odd as DD_TRACE_ENABLED should directly toggle tracer.enabled: https://github.com/DataDog/dd-trace-py/blob/1.x/ddtrace/tracer.py#L195

The only other thing you should be able to do to modify tracer.enabled afterwards is calling configure(): https://github.com/DataDog/dd-trace-py/blob/1.x/ddtrace/tracer.py#L317

I could only recreate this behavior when calling configure(). Going to sync with team and see what we can come up with.
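
For illustration, this is the kind of call that would override the environment variable after startup (a sketch assuming the dd-trace-py 1.x API; whether anything in the test setup actually does this is exactly the open question):

from ddtrace import tracer

# DD_TRACE_ENABLED is only read when the global tracer is created,
# so an explicit configure() call afterwards wins over the environment.
tracer.configure(enabled=True)
print(tracer.enabled)  # True, regardless of DD_TRACE_ENABLED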

fredrikaverpil commented 1 year ago

Could there be a regression issue?

See this issue: https://github.com/DataDog/dd-trace-py/issues/3671

Yun-Kim commented 1 year ago

@fredrikaverpil If this is a regression issue of the pytest plugin restarting a tracer, could you try disabling the plugin as specified on #3674? Specifically in your pytest config file, add:

[pytest]
addopts = -p no:ddtrace
fredrikaverpil commented 1 year ago

@Yun-Kim doing that made no difference. I still see the traceback and tracer.enabled=True.

So I just removed all ddtrace imports everywhere from my codebase, but I still have ddtrace installed in my venv. When I check sys.modules for ddtrace (from within a test running via pytest), I see it has somehow been imported anyway and shows up there. If I run pip uninstall ddtrace, I no longer see ddtrace in sys.modules (and there are no missing imports or crashes, because my codebase simply doesn't import ddtrace).

Do you have some sort of hook that plugs into pytest and enables the tracer?
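
For anyone debugging the same thing, one way to see exactly which ddtrace plugins pytest registered is to ask the plugin manager from within a throwaway test (a sketch; the test name is made up):

def test_show_ddtrace_plugins(request):
    # List every registered pytest plugin whose name mentions ddtrace.
    pm = request.config.pluginmanager
    names = [name for name, _ in pm.list_name_plugin() if "ddtrace" in str(name)]
    print(names)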

Yun-Kim commented 1 year ago

@fredrikaverpil If this is a regression issue of the pytest plugin restarting a tracer, could you try disabling the plugin as specified on #3674? Specifically in your pytest config file, add:

[pytest]
addopts = -p no:ddtrace

Just to confirm, this did not disable the plugin on your end? We do have hooks in our pytest plugin which start a global tracer, but as long as you set the environment variable DD_TRACE_ENABLED=false and disable the plugin as specified above, this shouldn't matter.

fredrikaverpil commented 1 year ago

Just to confirm, this did not disable the plugin on your end?

Correct. This did not disable the tracer.

I have this in my pyproject.toml:

[tool.pytest.ini_options]
testpaths = "tests"
env_files = ["env/local_test.env"]
addopts = "--cov --cov-report=html -rxXs"

...and so I added what you suggested:

[tool.pytest.ini_options]
testpaths = "tests"
env_files = ["env/local_test.env"]
addopts = "--cov --cov-report=html -rxXs -p no:ddtrace"

But the error log and the traceback still appears in the terminal, if I have a @tracer.wrap.

All in all, it seems to me there has to be something off with the pytest plugin, as I've removed all ddtrace imports from my codebase and ddtrace still shows up in sys.modules.

When I loop through sys.modules, the last key that contains the word ddtrace is ddtrace.contrib.pytest_bdd.plugin. Coincidence?

import sys

# Collect every ddtrace module that is present in sys.modules.
mods = []
for mod in sys.modules:
    if "ddtrace" in mod:
        mods.append(sys.modules[mod])

(screenshots of the ddtrace entries found in sys.modules)

fredrikaverpil commented 1 year ago

From the ddtrace.contrib.pytest.__init__.py and ddtrace.contrib.pytest_bdd.__init__.py:

The ddtrace.pytest_bdd plugin for pytest-bdd has the side effect of importing the ddtrace package and starting a global tracer.

I'm now also running with the pyproject.toml settings below, and the error log and the traceback are finally GONE! 🎉 :

[tool.pytest.ini_options]
testpaths = "tests"
env_files = ["env/local_test.env"]
addopts = "-p no:ddtrace -p no:ddtrace.pytest_bdd"
fredrikaverpil commented 1 year ago

@Yun-Kim so it seems by adding both options -p no:ddtrace -p no:ddtrace.pytest_bdd solved the issue.

Additional info, not sure if it is helpful:

I've put some logging in the asbool function and in the tracer's __init__ method:

2022-09-15 22:40:51.401 | WARNING  | ddtrace.internal.utils.formats:asbool:73 - in asbool returns False based on: value='false'
2022-09-15 22:40:51.405 | WARNING  | ddtrace.internal.utils.formats:asbool:69 - in asbool returns False based on: value=False
2022-09-15 22:40:51.405 | WARNING  | ddtrace.internal.utils.formats:asbool:69 - in asbool returns False based on: value=False
2022-09-15 22:40:51.405 | WARNING  | ddtrace.internal.utils.formats:asbool:69 - in asbool returns False based on: value=False
2022-09-15 22:40:51.405 | WARNING  | ddtrace.internal.utils.formats:asbool:69 - in asbool returns False based on: value=False
2022-09-15 22:40:51.405 | WARNING  | ddtrace.internal.utils.formats:asbool:69 - in asbool returns False based on: value=False
2022-09-15 22:40:51.405 | WARNING  | ddtrace.internal.utils.formats:asbool:69 - in asbool returns False based on: value=False
2022-09-15 22:40:51.405 | WARNING  | ddtrace.internal.utils.formats:asbool:69 - in asbool returns False based on: value=False
2022-09-15 22:40:51.422 | WARNING  | ddtrace.internal.utils.formats:asbool:69 - in asbool returns False based on: value=False
2022-09-15 22:40:51.422 | WARNING  | ddtrace.internal.utils.formats:asbool:73 - in asbool returns False based on: value='false'
2022-09-15 22:40:51.422 | WARNING  | ddtrace.internal.utils.formats:asbool:69 - in asbool returns True based on: value=True
2022-09-15 22:40:51.422 | WARNING  | ddtrace.tracer:__init__:197 - self.enabled=True
2022-09-15 22:40:51.423 | WARNING  | ddtrace.internal.utils.formats:asbool:69 - in asbool returns False based on: value=False
2022-09-15 22:40:51.423 | WARNING  | ddtrace.internal.utils.formats:asbool:69 - in asbool returns False based on: value=False
2022-09-15 22:40:51.423 | WARNING  | ddtrace.internal.utils.formats:asbool:69 - in asbool returns False based on: value=False
2022-09-15 22:40:51.423 | WARNING  | ddtrace.tracer:__init__:242 - self.enabled=True

If I then use the pytest opts -p no:ddtrace -p no:ddtrace.pytest_bdd, I see no log output anymore.

And finally, if I keep the pytest opts above enabled, and import ddtrace in my code and also add a @tracer.wrap, I see how these logs appear again:

2022-09-15 22:42:28.215 | WARNING  | ddtrace.internal.utils.formats:asbool:73 - in asbool returns False based on: value='false'
2022-09-15 22:42:28.221 | WARNING  | ddtrace.internal.utils.formats:asbool:69 - in asbool returns False based on: value=False
2022-09-15 22:42:28.221 | WARNING  | ddtrace.internal.utils.formats:asbool:69 - in asbool returns False based on: value=False
2022-09-15 22:42:28.221 | WARNING  | ddtrace.internal.utils.formats:asbool:69 - in asbool returns False based on: value=False
2022-09-15 22:42:28.221 | WARNING  | ddtrace.internal.utils.formats:asbool:69 - in asbool returns False based on: value=False
2022-09-15 22:42:28.221 | WARNING  | ddtrace.internal.utils.formats:asbool:69 - in asbool returns False based on: value=False
2022-09-15 22:42:28.221 | WARNING  | ddtrace.internal.utils.formats:asbool:69 - in asbool returns False based on: value=False
2022-09-15 22:42:28.221 | WARNING  | ddtrace.internal.utils.formats:asbool:69 - in asbool returns False based on: value=False
2022-09-15 22:42:28.247 | WARNING  | ddtrace.internal.utils.formats:asbool:69 - in asbool returns False based on: value=False
2022-09-15 22:42:28.247 | WARNING  | ddtrace.internal.utils.formats:asbool:73 - in asbool returns False based on: value='false'
2022-09-15 22:42:28.247 | WARNING  | ddtrace.internal.utils.formats:asbool:73 - in asbool returns False based on: value='false'
2022-09-15 22:42:28.247 | WARNING  | ddtrace.tracer:__init__:197 - self.enabled=False
2022-09-15 22:42:28.247 | WARNING  | ddtrace.internal.utils.formats:asbool:69 - in asbool returns False based on: value=False
2022-09-15 22:42:28.248 | WARNING  | ddtrace.internal.utils.formats:asbool:69 - in asbool returns False based on: value=False
2022-09-15 22:42:28.248 | WARNING  | ddtrace.internal.utils.formats:asbool:69 - in asbool returns False based on: value=False
2022-09-15 22:42:28.248 | WARNING  | ddtrace.tracer:__init__:242 - self.enabled=False
Yun-Kim commented 1 year ago

@fredrikaverpil fantastic! Happy to hear you were able to resolve the issue 🄳

fredrikaverpil commented 1 year ago

@Yun-Kim yes, finally 😅

However, don't you think there's a problem here with ddtrace silently logging a traceback (the raised exception's exc_info) about not being able to connect when you run pytest and have a fixture using code wrapped with @tracer.wrap?

I mean, it seems to me it should've been enough to set DD_TRACE_ENABLED=false but in this scenario I also need to add the pytest options -p no:ddtrace -p no:ddtrace.pytest_bdd, which wasn't clear to me at all.

Yun-Kim commented 1 year ago

You're right, it is problematic that our pytest plugins don't check the value of DD_TRACE_ENABLED. I'm reopening this issue just to track this problem and for now, you can keep using the workaround -p no:ddtrace -p no:ddtrace.pytest_bdd and we'll work on getting a fix out. Thanks!
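
For completeness, a programmatic variant of that workaround can be sketched as well (assuming the dd-trace-py 1.x API; a sketch, not a confirmed fix): force the global tracer off from a conftest.py so it applies regardless of how the plugin started it.

# conftest.py (hypothetical workaround)
from ddtrace import tracer

def pytest_configure(config):
    # Disable the global tracer for the whole test session,
    # no matter which plugin or import already started it.
    tracer.configure(enabled=False)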

Yun-Kim commented 1 year ago

Hi @fredrikaverpil, unfortunately we still weren't able to reproduce your issue on our end. The proposed fix #4204 doesn't directly address the underlying issue, so we can't merge it until we figure out why the pytest plugin disregards the value of DD_TRACE_ENABLED. Could you hop on to the Datadog Public Slack and/or open a support ticket (https://www.datadoghq.com/support/) so that we can take a deeper dive into this issue? Thanks!

fredrikaverpil commented 1 year ago

@Yun-Kim hi, sorry for the late reply. I'll drop by in Slack and maybe we can schedule a session or something.

fredrikaverpil commented 1 year ago

I'm opening a support case as this re-emerged for us.

I'll first post a reproducible example below 👇

fredrikaverpil commented 1 year ago

@Yun-Kim so I think we've had some issues where we have been using dotenv file solutions to set DD_TRACE_ENABLED=false. I believe the variable may then have been set after ddtrace had already enabled the tracer (or something like that). So I think we have now finally gotten to the bottom of this problem in CI 🤞
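
The ordering problem looks roughly like this (a sketch of the import/env timing, not the actual service code):

import os

# ddtrace reads DD_TRACE_ENABLED when the global tracer is created at import time...
import ddtrace

# ...so a dotenv loader that only sets the variable afterwards comes too late:
os.environ["DD_TRACE_ENABLED"] = "false"
print(ddtrace.tracer.enabled)  # still True with this ordering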

However, the problem remains for local development, as the pytest options -p no:ddtrace -p no:ddtrace.pytest_bdd do not work. I am almost certain this used to work at some point. I have posted a reproducible example where these pytest options do not disable the tracer: https://github.com/fredrikaverpil/ddcrash

It would be great to have a way to tell pytest not to enable tracing (preferably as a commandline option), without requiring the environment variable DD_TRACE_ENABLED set in the environment.

fredrikaverpil commented 1 year ago

You may wonder why my test is creating a custom span. The reason is that this is what the source code being tested does. For e.g. Kubernetes cronjobs we don't have incoming traces, but we would still like to have errors reported into APM. We also do this for unexpected crashes via sys.excepthook.

firezdog commented 1 year ago

I am also observing this behavior, although I'm not sure whether it goes away when setting the flag. What's odd is that we use pytest-socket, and the test does not fail even though all socket calls should be blocked and cause failures.

turya commented 1 year ago

I'm observing this as well. pytest always hangs if ddtrace is installed: it always installs the plugin, and none of the suggestions above disable it. As suggested above, the plugin should at the very least respect the DD_TRACE_ENABLED=false env variable.

turya commented 1 year ago

In fact, I strongly believe that the default behavior should be for it to be off unless it's explicitly enabled.

turya commented 1 year ago

If it helps at all, ddtrace is version 1.6.3

root@f60ede01fe14:/app# env | grep DD
DD_PROFILING_ENABLED=false
DD_TRACE_ENABLED=false
root@f60ede01fe14:/app# pytest -p no:ddtrace -p no:ddtrace.pytest_bdd
================================================================ test session starts ================================================================
platform linux -- Python 3.9.16, pytest-7.2.0, pluggy-1.0.0
rootdir: /app, configfile: .pytest.ini
collected 17 items      

One of two things happens here:

========================================================== 17 passed, 9 warnings in 1.73s ===========================================================
Unable to export profile: ddtrace.profiling.exporter.http.UploadFailed: [Errno 99] Cannot assign requested address. Ignoring.

or it hangs and I have to Ctrl-C:

Traceback (most recent call last):
  File "/usr/local/lib/python3.9/site-packages/ddtrace/profiling/profiler.py", line 88, in stop
    self._profiler.stop(flush)
  File "/usr/local/lib/python3.9/site-packages/ddtrace/internal/service.py", line 85, in stop
    self._stop_service(*args, **kwargs)
  File "/usr/local/lib/python3.9/site-packages/ddtrace/profiling/profiler.py", line 293, in _stop_service
    self._scheduler.flush()
  File "/usr/local/lib/python3.9/site-packages/ddtrace/profiling/scheduler.py", line 53, in flush
    exp.export(events, start, self._last_export)
  File "/usr/local/lib/python3.9/site-packages/ddtrace/profiling/exporter/http.py", line 217, in export
    self._upload(client, self.endpoint_path, body, headers)
  File "/usr/local/lib/python3.9/site-packages/ddtrace/profiling/exporter/http.py", line 222, in _upload
    self._retry_upload(self._upload_once, client, path, body, headers)
  File "/usr/local/lib/python3.9/site-packages/tenacity/__init__.py", line 416, in __call__
    self.sleep(do)
  File "/usr/local/lib/python3.9/site-packages/tenacity/nap.py", line 31, in sleep
    time.sleep(seconds)
KeyboardInterrupt
turya commented 1 year ago

FWIW, nosetests also hangs at the end of the run while trying to send its profile, even though the env variables say it should not, so maybe this isn't just a pytest thing. Maybe don't automatically turn on your tooling by default.

emmettbutler commented 7 months ago

Manually doing what the GitHub Action was supposed to do (fixed in https://github.com/DataDog/dd-trace-py/pull/7835):

This issue has been automatically closed after six months of inactivity. If it's a feature request, it has been added to the maintainers' internal backlog and will be included in an upcoming round of feature prioritization. Please comment or reopen if you think this issue was closed in error.

followben commented 5 months ago

I am also observing this behavior; in my case the tooling isn't respecting our dotenv under local development. Pretty disappointing it hasn't been addressed after @fredrikaverpil provided a minimal repro. Please re-open.

romainkomorndatadog commented 5 months ago

This isn't exactly a CI Visibility product issue, so I have to relinquish this back to our core folks. @mabdinur, I'm not sure if you want to reopen this or track it elsewhere.