openzipkin / zipkin

Zipkin is a distributed tracing system
https://zipkin.io/
Apache License 2.0

Performance test for integrated span collection #1148

Open codefromthecrypt opened 8 years ago

codefromthecrypt commented 8 years ago

In the old repository, we mentioned we needed some work to facilitate integration benchmarking. For example, we need to be able to invoke reporters on-demand, regardless of whether that code lives here or in brave.

While many things need benchmarking, one can reasonably argue that span collection is the most critical. For example, there are far more applications reporting data than end users of zipkin's UI or API. By benchmarking collection, we can help identify bugs or limitations that impact zipkin's ability to perform its most basic function: storing spans.

It is important that this benchmark be something that others can run, as laptops often aren't representative. For example, at higher loads there are likely multiple collector shards, and each may have different timeout, thread pool, and heap size configuration than the defaults.

We could have a test that produces spans over http, kafka or scribe and somehow knows how to analyze stats or otherwise see how many actually arrived. For example, it could read the collector metrics until all messages are accepted, then poll the traces query until all spans are processed or a timeout elapses. On timeout, it could verify in a storage-specific way how many spans landed. All of this is needed because storage operations are async.
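To make the verification side concrete, here's a rough sketch in Java. The names (`CollectionVerifier`, `awaitSpans`) and the naive json counting are made up for illustration; it assumes the standard `/api/v2/traces` endpoint on a local server and skips the collector-metrics check described above:

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

/** Polls the query api until the expected spans are readable, or times out. */
class CollectionVerifier {
  static final HttpClient HTTP = HttpClient.newHttpClient();
  static final String BASE_URL = "http://localhost:9411"; // assumption: local server

  static boolean awaitSpans(String serviceName, int expectedSpans, long timeoutMillis)
      throws Exception {
    long deadline = System.currentTimeMillis() + timeoutMillis;
    while (System.currentTimeMillis() < deadline) {
      // limit caps the number of *traces* returned; a real tool would also set
      // lookback, since the test spans have timestamps spread across days
      String json = get(BASE_URL + "/api/v2/traces?serviceName=" + serviceName
          + "&limit=" + expectedSpans);
      if (countSpans(json) >= expectedSpans) return true; // everything landed
      Thread.sleep(500); // storage writes are async: back off and retry
    }
    return false; // timeout: caller falls back to a storage-specific count
  }

  static String get(String url) throws Exception {
    return HTTP.send(HttpRequest.newBuilder(URI.create(url)).build(),
        HttpResponse.BodyHandlers.ofString()).body();
  }

  /** Naive count: every span object in the traces json has a "traceId" field. */
  static int countSpans(String tracesJson) {
    int count = 0, at = 0;
    while ((at = tracesJson.indexOf("\"traceId\"", at)) != -1) {
      count++;
      at += 9;
    }
    return count;
  }
}
```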

Using such a tool, admins can sample or throttle writes to meet the performance characteristics of their integrated zipkin system. For example, they can set the collector sample rate accordingly, or use something like zipkin-zookeeper to ensure writes don't exceed the capacity of the system.

At minimum, these scenarios should be tested: reusing the same assets we do for benchmarks, vary the span count, spans/message and messages/second. It is important that these spans have unique timestamps and ids, and that the timestamps vary across days. By using the same assets as our benchmarks, we can more consistently test improvements that may be library-specific.
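For illustration, here's a sketch of what such a generator could look like using the zipkin2 library; the class name and parameters are made up, not the actual benchmark assets:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Random;
import java.util.concurrent.TimeUnit;
import zipkin2.Span;

/** Builds spans with unique ids and timestamps spread across several days. */
class BenchmarkSpanGenerator {
  final Random random = new Random();

  List<Span> generate(int spanCount, int daySpread) {
    long nowMicros = TimeUnit.MILLISECONDS.toMicros(System.currentTimeMillis());
    List<Span> spans = new ArrayList<>(spanCount);
    for (int i = 0; i < spanCount; i++) {
      // vary timestamps across daySpread days, as time-bucketed storage behaves
      // differently when writes hit multiple partitions
      long offsetMicros = TimeUnit.DAYS.toMicros(random.nextInt(daySpread));
      spans.add(Span.newBuilder()
          .traceId(randomHexId())
          .id(randomHexId()) // unique per span
          .name("benchmark")
          .timestamp(nowMicros - offsetMicros)
          .duration(150L)
          .build());
    }
    return spans;
  }

  String randomHexId() {
    return String.format("%016x", random.nextLong() | 1); // 16 lower-hex chars, non-zero
  }
}
```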

See #1142 #961 #444

codefromthecrypt commented 8 years ago

thinking about jbender for this http://blog.paralleluniverse.co/2016/03/30/http-server-benchmark/ https://github.com/pinterest/jbender

jorgheymans commented 3 years ago

Not sure if the view that span collection is the most performance critical still applies. A laggy, slow UI is what makes users turn away from zipkin ...

The idea was insightful at the time though, observability was novel territory back then. I doubt, however, that we can easily provide something that will work for all sites. Most low-traffic sites won't need this, and the high-traffic sites are most likely well enough staffed to take care of it themselves. We would just end up with a whole range of support issues we don't want to get involved in.

Closing this one; if you still feel there's merit in pursuing it, feel free to reopen.

codefromthecrypt commented 3 years ago

@anuraaga did the bulk of this with testcontainers. I think the general process could be lifted by someone else into a multi-node setup somehow: https://github.com/openzipkin/zipkin/blob/master/benchmarks/src/test/java/zipkin2/server/ServerIntegratedBenchmark.java
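For anyone picking this up, the pattern is roughly this: a minimal sketch using testcontainers' standard `GenericContainer` api, not code lifted from that file:

```java
import org.testcontainers.containers.GenericContainer;
import org.testcontainers.utility.DockerImageName;

class SingleNodeHarness {
  public static void main(String[] args) {
    // start a throwaway zipkin server, as the linked benchmark does
    try (GenericContainer<?> zipkin =
             new GenericContainer<>(DockerImageName.parse("openzipkin/zipkin"))
                 .withExposedPorts(9411)) {
      zipkin.start();
      String baseUrl = "http://" + zipkin.getHost() + ":" + zipkin.getMappedPort(9411);
      System.out.println("zipkin at " + baseUrl);
      // a multi-node variant would start several collector containers behind a
      // load balancer and point the span producers at that instead
    }
  }
}
```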