jaegertracing / jaeger-client-cpp

🛑 This library is DEPRECATED!
https://jaegertracing.io/
Apache License 2.0

Expose interface to flush buffered spans #53

Closed ringerc closed 2 years ago

ringerc commented 6 years ago

The tracer buffers closed spans, only writing them out intermittently.

Sometimes the client will know it has something to send and want to flush it promptly.

I suggest extending the OpenTracing interface to expose an explicit Flush on jaegertracing::Tracer.

Happy to send a patch if you think it's reasonable.

ringerc commented 6 years ago

Also requires promoting RemoteReporter::flush() to be a virtual method of Reporter.

tedsuo commented 6 years ago

Hi @ringerc, changes to the OpenTracing interface should be made against https://github.com/opentracing/opentracing-cpp instead of jaeger.

But it sounds like the issue is that the jaeger client does not flush spans on close (which it should), not that the flush method isn't directly exposed to 3rd party instrumentation.

ringerc commented 6 years ago

That's correct. However, in the course of working on that I noticed that there were places in my app where I'd really like to flush trace information eagerly. For example, at the completion of some top-level request handler, where I know there's work to send and there won't be work afterwards.

Ideally I'd not have timers on every backend ticking away anyway; that's going to get expensive in a multi-process server like postgresql. Being able to explicitly flush so I can do it from postgres's own event handling would be a big help. Hence this open issue.

rnburn commented 6 years ago

@ringerc -- that strategy sounds similar to how envoy does tracing: https://github.com/envoyproxy/envoy/blob/master/source/common/tracing/lightstep_tracer_impl.cc#L106. I made an interface for lightstep's tracer that allows it to be used in that manner (no threading or sockets managed by the tracer), but didn't pursue standardizing the interface, as I think it would be harder to make a vendor-neutral API from it.

Would be interested in getting some performance data to see what the cost of tracers doing their own threading and timers is.

isaachier commented 6 years ago

Interesting idea @ringerc, maybe I will reopen #54 after all...

ringerc commented 6 years ago

The reason I care somewhat about the threading and timers is that I'm working on a multi-processing shared-nothing-by-default system where everything is fork()ed.

So there are lots of procs in a service, each with their own tracer instance.

Those timers can add up. As can the threads.

It's not a big concern at this point, and if it becomes one I'll submit a separate patch that lets Jaeger cpp-client entirely disable automatic flushing and use only the main-thread. Shouldn't be too hard (famous last words).

The reason I want forced flushing now isn't so much overhead management as to make sure that spans are sent in a timely manner, especially if the proc may shortly exit. So while I have future performance in mind, it's not the immediate concern.

rnburn commented 6 years ago

Does postgresql use something like epoll or equivalent for its event handling, like nginx and envoy do? I don't think it would work well for it to manage the tracer's flushing without one, since that would cause the main thread to block whenever spans get sent over the network.

isaachier commented 6 years ago

In general I agree about the threading. I have written applications that have run out of threads (old machine and a lot of threads). The C++ client was heavily based on Go, except std::thread isn't as efficient as a goroutine. I wonder what you would suggest in general for C++ thread multiplexing. There are some interesting approaches here, including http://libdill.org/.

isaachier commented 6 years ago

Also, I'd love to use eventfd in the new C client I've been toying with, but it seems to only exist on Linux. @rnburn, how does epoll compare in terms of performance to threading?

rnburn commented 6 years ago

I didn't do any benchmarking, but I expect there must be a reason why nginx, envoy, and varnish all make use of it or its equivalents.

But I'm not sure you'd want to be calling something like eventfd directly. Those applications all use higher level interfaces so that they can work on OSs that provide different functions (for example, envoy uses libevent).

The C++ client was heavily based on Go, except std::thread isn't as efficient as a goroutine.

Wouldn't say this is true at all. If you're misusing std::thread to continually create/join threads to do computations rather than using a thread pool that might be correct. But the tracer should only be creating a single thread once on construction.

isaachier commented 6 years ago

I wonder if the other jaeger clients use only one thread. I can tell you 100% that is not the case now in the C++ client. I could use a single dispatch thread for various async operations, but then the operations could block each other.

yurishkuro commented 6 years ago

I wonder if the other jaeger clients use only one thread.

it depends on the language.

isaachier commented 6 years ago

@rnburn, I'm not writing a library like nginx where IO throughput is extreme. Jaeger client throughput on a single client is pretty modest. Also, my guess is that projects like envoy use libevent for portability as well as efficiency. From a POSIX-only point of view, I find adding the libevent dependency just bloating a potentially lean client. Windows support is nice but not critical. OTOH, I don't want an OS API specific to Linux such as eventfd. So I've found the POSIX timer_create function, and I'll investigate that.

ringerc commented 6 years ago

@rnburn PostgreSQL uses epoll where available, falling back to select() or poll() - I don't remember which.

It's a shared-nothing-by-default shmem-based multi-processing server that uses fork() without exec() for its workers, like the apache2 prefork MPM does. So each backend is usually only handling one connection. epoll() is mainly used to reduce the cost of re-creating the wait event sets for every wait.

So PostgreSQL really doesn't care about the cost of writing some UDP trace data to a local socket that'll probably just get buffered and written out asynchronously by the kernel anyway. A remote writer that sends traces over HTTP to some remote socket would be more of a problem, though.

So I don't have a problem with using threads. I just want to wake them up myself when I know there's something to send, or during an event loop. And stop waking them on a timer.

Integrating into PostgreSQL's epoll loop is possible, but awkward. It's all abstracted behind a layer that hides whether PostgreSQL uses select(), epoll() or whatever the platform has. And it currently has some annoying deficiencies. The trace client lib would need to expose the file descriptor / handle for the socket and some event loop hooks. Annoying and probably not worth it imo.

"Keep it simple".

Adding a libevent dependency not only increases the footprint, it greatly increases the integration challenge of gluing the agent into other tools. If you're really keen to look into eventfd/libevent/etc, it'd be worth looking at how/if external tools can be integrated easily into the event loops of things like nginx.

In an ideal world I guess it'd support either libevent integration with an app that uses libevent, or its own threads. But that's getting complex. So long as the interface doesn't expose threading specifics etc I'd say stick to threading, play with other things later if/when someone needs it. It's the least intrusive into other apps, especially when dealing with arbitrary remote writers that may not have an async interface at all.

isaachier commented 6 years ago

Thanks @ringerc that helps clarify the potential solutions here. Unfortunately, I haven't found a great way of hiding behind a threading/IO interface without imposing a specific library on the application. I was hoping to use a C core to implement various language bindings. Seeing as node.js exclusively uses async io with libuv, libuv would be the ideal framework, but that imposes on the application.

My new idea is to avoid threading altogether in the C API and expose the IO interfaces directly (i.e. flush). The implementation in C would be completely separate from any OpenTracing standard, only meant to provide the Jaeger core. Then, the tracing implementations would use the C core to implement the OpenTracing standards. Also, C/C++ applications looking for raw performance could use the C core without any of the convenience functions of the higher-level APIs.

Off the top of my head, the only real difference would be the lack of timers/background jobs. Here are the background tasks I think the clients currently perform:

ringerc commented 6 years ago

Re thread creation/destruction rates, a threadpool with work items is always an option.

Of course things like threading are where you start to really miss C++.