Closed: OlivierAlbertini closed this issue 2 years ago.
Could you make some benchmarks to see how much of a performance issue this actually would be in a production app? My understanding is that most spans would not be sampled anyway, and we only would export to Zipkin a smallish number of spans.
Bringing in a new library potentially exposes end users to security risks or bugs if the dependency ever encounters them, versus the safe, built-in (if non-optimal) JSON.stringify.
I would be happy to make a benchmark. Do you have some examples that could inspire me, or tools/packages I could use, so I can provide something with added value? I just want to avoid making a clunky benchmark.
Any feedback would be appreciated.
I don't have experience with benchmarking tools, but I was thinking of using @clevernature/benchmark-regression to perform regression tests: https://www.npmjs.com/package/@clevernature/benchmark-regression
Thanks Mayur, I will give it a try.
I actually think something you just do as a one-off would be fine: say, a simple Node server that processes a hello-world type request, samples at 100%, and exports to Zipkin. Run a local server and a local Zipkin, then hammer it with traffic, say with Apache Bench or similar. See how the overall server performs with the native JSON.stringify vs. fast-json-stringify.
I'm not necessarily suggesting this is the best use of time right now, just that this would be my suggestion to validate that adding the dependency is worth it.
I also think more general benchmark tests that run on every PR would be good, but those could be more for core parts of the library (though maybe the Zipkin exporter as a whole would be worth covering too).
Simpler indeed, and good timing since I did https://github.com/open-telemetry/opentelemetry-js/pull/318
Using the http example in this repo (without the setTimeout on the server side): the first ab test is "as is"; the second is with JSON.stringify removed and a raw transformation applied (in order to avoid a dep), since the span is not so complex (Jaeger does this too).
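For illustration, the "raw transformation" idea might look like the sketch below (a hypothetical TypeScript sketch, not the actual change from PR #318): for a known, flat span shape, concatenate the JSON string by hand instead of walking the object generically. The ZipkinSpan fields here are a simplified subset of the Zipkin v2 model.

```typescript
// Simplified subset of a Zipkin v2 span; illustrative only.
interface ZipkinSpan {
  traceId: string;
  id: string;
  name: string;
  timestamp: number;
  duration: number;
}

// "Raw" serialization: build the JSON string directly for a known, flat
// shape. Strings still go through JSON.stringify so escaping stays correct.
function rawStringify(span: ZipkinSpan): string {
  return `{"traceId":${JSON.stringify(span.traceId)}` +
    `,"id":${JSON.stringify(span.id)}` +
    `,"name":${JSON.stringify(span.name)}` +
    `,"timestamp":${span.timestamp},"duration":${span.duration}}`;
}

const span: ZipkinSpan = {
  traceId: "d5e4f1a2",
  id: "b3c2d1e0",
  name: "GET /helloworld",
  timestamp: 1000000,
  duration: 500,
};

// For a flat span this matches the generic serializer byte for byte:
console.log(rawStringify(span) === JSON.stringify(span)); // true
```

The speedup comes from skipping the generic property walk; the tradeoff is that the function must be kept in sync with the span shape by hand.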
~/apps on 🐳 v19.03.2
➜ ab -n 5000 -c 100 http://localhost:8080/helloworld
This is ApacheBench, Version 2.3 <$Revision: 1826891 $>
Copyright 1996 Adam Twiss, Zeus Technology Ltd, http://www.zeustech.net/
Licensed to The Apache Software Foundation, http://www.apache.org/
Benchmarking localhost (be patient)
Completed 500 requests
Completed 1000 requests
Completed 1500 requests
Completed 2000 requests
Completed 2500 requests
Completed 3000 requests
Completed 3500 requests
Completed 4000 requests
Completed 4500 requests
Completed 5000 requests
Finished 5000 requests
Server Software:
Server Hostname: localhost
Server Port: 8080
Document Path: /helloworld
Document Length: 12 bytes
Concurrency Level: 100
Time taken for tests: 9.950 seconds
Complete requests: 5000
Failed requests: 0
Total transferred: 435000 bytes
HTML transferred: 60000 bytes
Requests per second: 502.49 [#/sec] (mean)
Time per request: 199.010 [ms] (mean)
Time per request: 1.990 [ms] (mean, across all concurrent requests)
Transfer rate: 42.69 [Kbytes/sec] received
Connection Times (ms)
min mean[+/-sd] median max
Connect: 0 3 1.6 3 41
Processing: 32 194 56.1 190 445
Waiting: 14 165 50.9 161 441
Total: 32 197 56.2 192 447
Percentage of the requests served within a certain time (ms)
50% 192
66% 208
75% 225
80% 229
90% 261
95% 286
98% 380
99% 445
100% 447 (longest request)
——————————————————————————————————————
➜ ab -n 5000 -c 100 http://localhost:8080/helloworld
This is ApacheBench, Version 2.3 <$Revision: 1826891 $>
Copyright 1996 Adam Twiss, Zeus Technology Ltd, http://www.zeustech.net/
Licensed to The Apache Software Foundation, http://www.apache.org/
Benchmarking localhost (be patient)
Completed 500 requests
Completed 1000 requests
Completed 1500 requests
Completed 2000 requests
Completed 2500 requests
Completed 3000 requests
Completed 3500 requests
Completed 4000 requests
Completed 4500 requests
Completed 5000 requests
Finished 5000 requests
Server Software:
Server Hostname: localhost
Server Port: 8080
Document Path: /helloworld
Document Length: 12 bytes
Concurrency Level: 100
Time taken for tests: 9.413 seconds
Complete requests: 5000
Failed requests: 0
Total transferred: 435000 bytes
HTML transferred: 60000 bytes
Requests per second: 531.19 [#/sec] (mean)
Time per request: 188.258 [ms] (mean)
Time per request: 1.883 [ms] (mean, across all concurrent requests)
Transfer rate: 45.13 [Kbytes/sec] received
Connection Times (ms)
min mean[+/-sd] median max
Connect: 0 2 1.8 2 17
Processing: 7 185 47.1 179 347
Waiting: 2 151 42.1 149 278
Total: 7 187 46.9 181 347
Percentage of the requests served within a certain time (ms)
50% 181
66% 201
75% 217
80% 229
90% 254
95% 280
98% 298
99% 305
100% 347 (longest request)
Should we keep JSON.stringify?
This is great 💯. Could you please use fast-json-stringify and run the same test?
Looks like you are saving ~10ms at the 50p latency, and ~140ms at the 99p latency.
I'm interested in what you said about applying a "RAW transformation (in order to avoid a dep)". Might that be a way to get better performance without introducing a dependency?
Do you know if fast-json-stringify supports a mode where you only include it as a dev dependency and not an install dependency? If you could have fast-json-stringify generate the function for you and then check in that function, I wonder if that could also be a way to avoid the extra install dependency but still get the performance benefit.
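To illustrate the "check in the generated function" idea: a dev-time script would run fast-json-stringify against the span schema and commit the resulting function, so only the output ships. The function below is a hand-written stand-in for that generated code (real generated output is longer and handles optional and nested fields); the names here are hypothetical.

```typescript
// Hypothetical checked-in serializer, standing in for code a dev-time script
// would generate with fast-json-stringify and commit to the repo. Because the
// generated function is committed, fast-json-stringify itself stays a
// devDependency and never ships to end users.
interface MinimalSpan {
  traceId: string;
  id: string;
  name: string;
}

function stringifyZipkinSpan(span: MinimalSpan): string {
  // Field order and escaping are fixed at generation time.
  return '{"traceId":' + JSON.stringify(span.traceId) +
    ',"id":' + JSON.stringify(span.id) +
    ',"name":' + JSON.stringify(span.name) + '}';
}

console.log(stringifyZipkinSpan({ traceId: "a1", id: "b2", name: "GET /" }));
// → {"traceId":"a1","id":"b2","name":"GET /"}
```

The tradeoff is that the committed function must be regenerated whenever the span schema changes.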
> fast-json-stringify generate the function for you and then check in that function

fast-json-stringify is pretty small. The gain would be minimal.
On the list of things to do, this is at best very far down, and at worst more trouble than it could ever be worth.
My thinking with inlining the resulting function from fast-json-stringify, rather than depending on the library, is based on a security consideration: each new dependency you add, no matter how small, provides an opportunity for the author of that package to be compromised and push a vulnerable version. In principle, one could audit every single package dependency all the way down the tree on every upgrade, but that's not super realistic.
So in general, I think we should have a fairly high bar for taking on a new install dependency. Dev dependencies concern me less, because a vulnerability there would only affect developers of the library, not end customers and their production apps.
Seems like consensus was reached to keep using JSON.stringify? So this can be closed @dyladan
Using an alternative to JSON.stringify is tricky in multiple ways, especially across different environments.
Current behaviour
The Zipkin exporter uses JSON.stringify when exporting spans: https://github.com/open-telemetry/opentelemetry-js/blob/b58965c97e5be0b556d83d0000eff0e3c7e25209/packages/opentelemetry-exporter-zipkin/src/zipkin.ts#L165

Desired behaviour