Summary of problem

When using the rq tracing integration, the rq worker emits the following exception in a warning log each time it processes a job with result_ttl=0:
Failed to set text tag 'job.status'
Traceback (most recent call last):
File "/workspace/test/.venv/lib/python3.9/site-packages/ddtrace/span.py", line 385, in set_tag_str
self._meta[key] = ensure_text(value, errors="replace")
File "/workspace/test/.venv/lib/python3.9/site-packages/six.py", line 955, in ensure_text
raise TypeError("not expecting type '%s'" % type(s))
TypeError: not expecting type '<class 'NoneType'>'
(More info after the reproduction steps. TL;DR: the job is deleted as part of worker.perform_job when result_ttl=0, so when ddtrace tries to get the status of the job in traced_perform_job, it's None, which can't be passed to set_tag_str.)

Which version of dd-trace-py are you using?

2.3.3

Which version of pip are you using?

pip 23.3.2 (python 3.9.9)

Which libraries and their versions are you using?

pip freeze:

How can we reproduce your problem?

Minimal reproduction:

pip install ddtrace rq
touch app.py myjobs.py

app.py:
from redis import Redis
from rq import Queue
from myjobs import do_work
q = Queue(connection=Redis())
q.enqueue(do_work)
q.enqueue(do_work, result_ttl=0)
myjobs.py:
import time


def do_work():
    time.sleep(1)
#############
# since the exception in question is being caught and logged,
# wrap the logger to see where the log came from
import logging
import traceback
factory = logging.getLogRecordFactory()
def wrap(*args, **kwargs):
    record = factory(*args, **kwargs)
    if record.exc_info:
        print("--------------")
        print("logged an exception at:")
        traceback.print_stack(limit=8)
        print("--------------")
    return record

logging.setLogRecordFactory(wrap)
start up the redis-server (version doesn't really play a role, but I have 5.0.7)
in the same directory as app.py and myjobs.py, start up ddtrace-run rq worker
run python app.py to send the two jobs to the worker (doesn't make a difference whether app.py is ddtraced or not)
What is the result that you get?
In the terminal running ddtrace-run rq worker:
23:02:12 Worker rq:worker:088aa9cc27674a1b814b0f06e09739c0 started with PID 77438, version 1.15.1
23:02:12 Subscribing to channel rq:pubsub:088aa9cc27674a1b814b0f06e09739c0
23:02:12 *** Listening on default...
failed to send, dropping 13 traces to intake at http://localhost:8126/v0.4/traces after 3 retries
23:02:15 default: myjobs.do_work() (c07877e6-ae37-4a09-8365-fb4c9d7036ae)
23:02:16 default: Job OK (c07877e6-ae37-4a09-8365-fb4c9d7036ae)
23:02:16 Result is kept for 500 seconds
23:02:16 default: myjobs.do_work() (74a20256-15eb-4248-9b45-f3a54aefc795)
23:02:17 default: Job OK (74a20256-15eb-4248-9b45-f3a54aefc795)
23:02:17 Result discarded immediately
--------------
logged an exception at:
File "/workspace/test/.venv/lib/python3.9/site-packages/rq/worker.py", line 1310, in main_work_horse
self.perform_job(job, queue)
File "/workspace/test/.venv/lib/python3.9/site-packages/ddtrace/contrib/trace_utils.py", line 341, in wrapper
return func(mod, pin, wrapped, instance, args, kwargs)
File "/workspace/test/.venv/lib/python3.9/site-packages/ddtrace/contrib/rq/__init__.py", line 197, in traced_perform_job
span.set_tag_str("job.status", job.get_status())
File "/workspace/test/.venv/lib/python3.9/site-packages/ddtrace/span.py", line 389, in set_tag_str
log.warning("Failed to set text tag '%s'", key, exc_info=True)
File "/home/coder/.pyenv/versions/3.9.9/lib/python3.9/logging/__init__.py", line 1458, in warning
self._log(WARNING, msg, args, **kwargs)
File "/home/coder/.pyenv/versions/3.9.9/lib/python3.9/logging/__init__.py", line 1587, in _log
record = self.makeRecord(self.name, level, fn, lno, msg, args,
File "/home/coder/.pyenv/versions/3.9.9/lib/python3.9/logging/__init__.py", line 1556, in makeRecord
rv = _logRecordFactory(name, level, fn, lno, msg, args, exc_info, func,
File "/workspace/test/./myjobs.py", line 19, in wrap
traceback.print_stack(limit=8)
--------------
Failed to set text tag 'job.status'
Traceback (most recent call last):
File "/workspace/test/.venv/lib/python3.9/site-packages/ddtrace/span.py", line 385, in set_tag_str
self._meta[key] = ensure_text(value, errors="replace")
File "/workspace/test/.venv/lib/python3.9/site-packages/six.py", line 955, in ensure_text
raise TypeError("not expecting type '%s'" % type(s))
TypeError: not expecting type '<class 'NoneType'>'
The first job does not cause anything unusual. However, for the second job with result_ttl=0, a warning exception log Failed to set text tag 'job.status' is emitted. The log comes from set_tag_str, which is called by span.set_tag_str("job.status", job.get_status()) inside traced_perform_job:

https://github.com/DataDog/dd-trace-py/blob/v2.4.0/ddtrace/contrib/rq/__init__.py#L192-L200
You can see it say the job result is "discarded immediately". It seems that when result_ttl=0 (i.e. we don't need the result of a job, we just want its side effects), rq entirely skips updating the status of a successful job (it never sets it to "finished") and immediately deletes the job. It's not really documented, but here is a little bit of related docs: https://python-rq.org/docs/results/#return-value-ttl

You can see this behaviour in the source code of rq's worker.perform_job (I linked to 1.15.1, but it's been like this for a while):

worker.perform_job: https://github.com/rq/rq/blob/v1.15.1/rq/worker.py#L1428-L1440
worker.handle_job_success calls job._handle_success only if result_ttl is nonzero: https://github.com/rq/rq/blob/v1.15.1/rq/worker.py#L1390-L1395
job._handle_success is where the status gets updated to "finished": https://github.com/rq/rq/blob/v1.15.1/rq/job.py#L1444-L1447
job.cleanup is called with result_ttl passed in; when result_ttl is 0, job.cleanup immediately deletes the job: https://github.com/rq/rq/blob/v1.15.1/rq/job.py#L1359-L1373
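This is also easy to observe from the enqueueing side. The following sketch assumes the redis-server and the rq worker from the reproduction steps are still running and that it is run next to myjobs.py; it only demonstrates that the job really is gone once it completes with result_ttl=0:

import time

from redis import Redis
from rq import Queue

from myjobs import do_work

q = Queue(connection=Redis())
job = q.enqueue(do_work, result_ttl=0)

time.sleep(3)  # give the worker time to run do_work (1 second) and discard the job
print(job.get_status())  # expected to print None: the job hash no longer exists in redis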
So by the time ddtrace finishes the wrapped perform_job and gets to the finally block, the job has already been deleted.
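The TypeError in the logged warning then follows directly from handing that None to set_tag_str, which passes it to six.ensure_text. A minimal standalone illustration (this only assumes six is installed, as it is in the environment above; it is not ddtrace code):

from six import ensure_text

status = None  # what job.get_status() returns once the job has been deleted
ensure_text(status, errors="replace")  # raises TypeError: not expecting type '<class 'NoneType'>'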
What is the result that you expected?
No exceptions caused (raised/logged).
For example, using set_tag rather than set_tag_str, since the status might be None? I'm not sure there's another way to come up with a tag value for the status that's more useful than None. For example, if job.get_status(refresh=False) were used in this situation, the status would just be the stale value "started", because as mentioned above the "finished" status never gets set on the job.
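For illustration only, a sketch of the "coerce to a string" direction (this is not the actual dd-trace-py patch; the span name is arbitrary and the None value stands in for job.get_status() after the job was deleted):

from ddtrace import tracer

status = None  # stand-in for job.get_status() once result_ttl=0 has deleted the job
with tracer.trace("example.perform_job") as span:  # arbitrary span name for this demo
    span.set_tag_str("job.status", str(status))  # str(None) == "None", so no warning is logged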
Thanks for the detailed bug report. I'll put up a fix for this. When the job status is not available we can just default to "None". I opened a PR to fix this behavior.