open-telemetry / opentelemetry-cpp

The OpenTelemetry C++ Client
https://opentelemetry.io/
Apache License 2.0

JSON Serialization Performance #2541

Open perhapsmaple opened 9 months ago

perhapsmaple commented 9 months ago

I have been exploring different JSON libraries to optimize the performance of my custom pipeline and exporter, which writes traces and logs to files. My primary concern was serialization speed, which the OtlpHttpExporter struggled with under heavy logging. While the nlohmann::json library is intuitive and easy to use, it is not very fast at serializing, and this is particularly evident when writing a lot of logs. Switching to the rapidjson library brought a significant 40-50% performance improvement to my logging system.

To benchmark the serialization impact, I modified the OtlpHttpClient to return ExportResult::kSuccess immediately after converting the proto message to JSON and dumping it to a string. I then modified example_otlp_http with the following:

...
constexpr uint64_t kMaxIterations = 1000000;
...

InitTracer();

// Time the whole loop; with the early return in OtlpHttpClient, no network I/O is included.
auto start = std::chrono::high_resolution_clock::now();

for (uint64_t i = 0; i < kMaxIterations; i++)
{
  foo_library();
}

auto stop = std::chrono::high_resolution_clock::now();
auto duration = std::chrono::duration_cast<std::chrono::milliseconds>(stop - start);
std::cout << "BM_JSON: " << duration.count() << " milliseconds" << std::endl;

CleanupTracer();

Result for nlohmann::json implementation: 127145 milliseconds
Result for rapidjson implementation: 82763 milliseconds
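To put the two timings on a common footing, here is a small sketch that turns a pair of wall-clock results into a speedup factor, a percentage reduction in time, and a per-iteration cost. The `Comparison` struct and `Compare` function are hypothetical helpers written for this illustration; they are not part of the benchmark branch.

```cpp
// Hypothetical helper for reading the benchmark numbers; not part of opentelemetry-cpp.
struct Comparison
{
  double speedup;                // baseline time / candidate time
  double time_reduction_pct;     // percentage of wall time saved by the candidate
  double baseline_us_per_iter;   // microseconds per loop iteration, baseline
  double candidate_us_per_iter;  // microseconds per loop iteration, candidate
};

// Both timings are total wall-clock milliseconds for the same number of iterations.
Comparison Compare(double baseline_ms, double candidate_ms, double iterations)
{
  Comparison c;
  c.speedup               = baseline_ms / candidate_ms;
  c.time_reduction_pct    = (baseline_ms - candidate_ms) / baseline_ms * 100.0;
  c.baseline_us_per_iter  = baseline_ms * 1000.0 / iterations;
  c.candidate_us_per_iter = candidate_ms * 1000.0 / iterations;
  return c;
}
```

Calling `Compare(127145, 82763, 1000000)` with the results above gives a speedup of roughly 1.54x, which is the same measurement as a wall-time reduction of about 35%, or about 127 µs versus 83 µs per loop iteration.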

The code is available at: https://github.com/perhapsmaple/opentelemetry-cpp/tree/json-benchmark

It is not final; I think some more changes could be made to make it a little more efficient.

I think this is an easy avenue for improvement, and we should consider benchmarking more thoroughly with both libraries. Happy to hear your thoughts and feedback.

github-actions[bot] commented 9 months ago

This issue is available for anyone to work on. Make sure to reference this issue in your pull request. :sparkles: Thank you for your contribution! :sparkles:

dufferzafar commented 9 months ago

I don't think we'd end up depending on Boost libs, but just for the sake of it, I wanted to mention Boost::JSON as the natural "fast & clean" json library: https://230.jsondocs.prtest.cppalliance.org/libs/json/doc/html/json/benchmarks.html

lalitb commented 9 months ago

One option could be for the otel SDK to provide an abstract interface for JSON serialization, with a default implementation based on nlohmann-json. Users could then bring their own implementation, which could internally use rapidjson or Boost::JSON. This would be similar to how we provide HTTPClientFactory, with a default implementation for curl.
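A minimal sketch of what such a pluggable serializer might look like, purely as an illustration of the idea. Every name here (`JsonSerializer`, `DefaultJsonSerializer`, `FileExporter`, `Fields`) is hypothetical and does not exist in opentelemetry-cpp. The default backend is a hand-rolled stand-in so the example stays self-contained; in the SDK that slot would presumably be filled by nlohmann::json.

```cpp
#include <cstddef>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical flat record type, standing in for a converted proto message.
using Fields = std::vector<std::pair<std::string, std::string>>;

// Hypothetical abstract interface the SDK could expose.
class JsonSerializer
{
public:
  virtual ~JsonSerializer() = default;
  virtual std::string Serialize(const Fields &fields) const = 0;
};

// Stand-in for the default backend. It only escapes quotes and backslashes,
// so it is not a complete JSON writer.
class DefaultJsonSerializer : public JsonSerializer
{
public:
  std::string Serialize(const Fields &fields) const override
  {
    std::string out = "{";
    for (std::size_t i = 0; i < fields.size(); ++i)
    {
      if (i > 0)
      {
        out += ",";
      }
      out += "\"" + Escape(fields[i].first) + "\":\"" + Escape(fields[i].second) + "\"";
    }
    out += "}";
    return out;
  }

private:
  static std::string Escape(const std::string &s)
  {
    std::string r;
    for (char c : s)
    {
      if (c == '"' || c == '\\')
      {
        r += '\\';
      }
      r += c;
    }
    return r;
  }
};

// Hypothetical exporter that accepts a user-supplied serializer and falls
// back to the default one when none is given.
class FileExporter
{
public:
  explicit FileExporter(std::unique_ptr<JsonSerializer> serializer)
      : serializer_(std::move(serializer))
  {
    if (!serializer_)
    {
      serializer_.reset(new DefaultJsonSerializer());
    }
  }

  std::string Export(const Fields &fields) const { return serializer_->Serialize(fields); }

private:
  std::unique_ptr<JsonSerializer> serializer_;
};
```

A user who wants rapidjson or Boost::JSON would derive from `JsonSerializer` and pass an instance to the exporter; passing `nullptr` keeps the default. One trade-off of this design is a virtual call per serialized record, which is likely small next to the serialization work itself but would be worth measuring.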

github-actions[bot] commented 7 months ago

This issue was marked as stale due to lack of activity.