ioloop_util.submit & CompositeReporter.close both call Future.set_result/set_exception directly and this can lead to cryptic hangs when await-ing their results across threads. using call_soon_threadsafe when the target Future belongs to a different event loop ensures it is only updated from the context of its associated loop & its completion is properly detected.
assigning to Reporter._stopped in _flush() instead of close() avoids dropping those spans passed to report_span just prior to Reporter.close() for which _report_span_from_ioloop will execute only afterwards; this way those scheduled callbacks will get a chance to run before _flush stops the reporter and those pending spans make it into the queue.
this PR avoids problematic direct Future.set_result/set_exception calls from the jaeger-client background tornado IO loop on Futures belonging to a wholly separate event loop as Futures are not threadsafe.
it also stops Reporter.close() from dropping spans whose background _report_span_from_ioloop callbacks are still pending execution.
NOTE: this currently relies on Tornado 5+ & Python 3.7+ as it makes use of tornado.concurrent.Future being an alias for asyncio.Future and the existence of Future.get_loop, respectively; I'm investigating a way of keeping this backwards compatible
ioloop_util.submit
&CompositeReporter.close
both callFuture.set_result
/set_exception
directly and this can lead to cryptic hangs whenawait
-ing their results across threads. usingcall_soon_threadsafe
when the targetFuture
belongs to a different event loop ensures it is only updated from the context of its associated loop & its completion is properly detected.assigning to
Reporter._stopped
in_flush()
instead ofclose()
avoids dropping those spans passed toreport_span
just prior toReporter.close()
for which_report_span_from_ioloop
will execute only afterwards; this way those scheduled callbacks will get a chance to run before_flush
stops the reporter and those pending spans make it into the queue.Signed-off-by: obataku 19821199+obataku@users.noreply.github.com
Which problem is this PR solving?
this PR avoids problematic direct
Future.set_result
/set_exception
calls from the jaeger-client background tornado IO loop onFuture
s belonging to a wholly separate event loop asFuture
s are not threadsafe. it also stopsReporter.close()
from dropping spans whose background_report_span_from_ioloop
callbacks are still pending execution.NOTE: this currently relies on Tornado 5+ & Python 3.7+ as it makes use of
tornado.concurrent.Future
being an alias forasyncio.Future
and the existence ofFuture.get_loop
, respectively; I'm investigating a way of keeping this backwards compatible