mhdawson opened 9 years ago
Concerning benchmarking "real" apps, it might be hard to get meaningful results that can be compared. When you benchmark an app, there are many modules involved, plus external resources like databases.
Imagine benchmarking acmeair v0.1.1 on io.js 2.2.1 on day 1. A week later we benchmark acmeair v0.1.1 on io.js 2.2.2 and the benchmark shows that it's way faster.
But we don't actually know why: it might be an updated module, a database upgrade, or io.js itself being faster.
I think benchmarking a real application is a good thing, but there should be a clear strategy for getting reproducible results. Maybe shrinkwrapping module versions and mocking databases could do the trick.
What do you think?
Agreed - for any benchmark it will be important to only change one "component" (be that node.js/io.js or used module) at a time, so we know where the change in performance is coming from. Shrinkwrapping is a good approach to making sure that happens.
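As a concrete sketch of the shrinkwrap approach (the `acmeair` directory name is just an illustration of the app under test):

```shell
# Pin the exact dependency tree of the app under test before the first run.
cd acmeair                 # hypothetical checkout of the app being benchmarked
npm install                # resolve and install the dependency tree once
npm shrinkwrap             # write npm-shrinkwrap.json, locking exact versions

# Commit npm-shrinkwrap.json; every later run then installs the same tree,
# so only the Node.js/io.js version under test changes between runs.
```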
+1 for seabaylea's comment. For comparison purposes we'll want to limit the changes so that we can isolate what may have affected results
Do you think database latency might be an issue? If we stick to the same machine(s) and database version it might not be a problem.
Like other variables, we'd need to make sure we keep the database version/system consistent between runs, and when we do change the database, not change anything else at the same time.
Since WebSockets are often used with Node, here are some benchmarks we might consider including:
- https://www.npmjs.com/package/thor
- https://www.npmjs.com/package/websocket-benchmark
- https://www.npmjs.com/package/websocket-bench
Here are a couple gulp scripts that would put decent pressure on a macro-benchmark: https://github.com/gulpjs/gulp/issues/1118
docs/case_coverage.md is somewhat sparse. I am interested in helping to populate it.
I'd like to use this comment to track recommended benchmarks for the various use cases. If you have a suggestion, please reply on this issue and I'll update this comment. The current use cases below are taken from #243.
| Use case | Suggested benchmark(s) |
|---|---|
| Back-end API services | - ezPAARSE? (but see #76)<br>- ? |
| Service oriented architectures (SOA) | ? |
| Microservice-based applications | - Node-DC-EIS in u-service mode (but see #78)<br>- jasnell and mcollina suggested workloads that are (a) JSON parse/stringify heavy, or (b) use FS and DNS heavily |
| Generating/serving dynamic web page content | - Acme Air<br>- Node-DC-EIS (monolithic mode)<br>- Node-DC-SSR (electrode)<br>- ghost |
| Single page applications (SPA) | etherpad-lite |
| Scripting and automation | - Micro-benchmark for `require`<br>- Micro-benchmark for node start/stop time |
| Agents and data collectors | Something based on Telegraf? |
| Developer tooling: web | Web Tooling Benchmark |
| Developer tooling: Node.js | Run npm commands like `npm install` and `npm audit`. Ideally we configure npm to use a local registry to eliminate network interference. |
| Desktop applications | Electron. Atom. |
| Systems software | Synthetic workload provided by jorangreef |
| Embedded software | ? |
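As a rough sketch of the JSON parse/stringify-heavy workload suggested for the microservice row above (the payload shape and iteration count are invented for illustration):

```javascript
// Minimal JSON parse/stringify micro-benchmark sketch.
// The payload is a made-up microservice-style message; sizes are arbitrary.
const payload = {
  id: 'req-12345',
  headers: { 'content-type': 'application/json' },
  items: Array.from({ length: 100 }, (_, i) => ({ sku: i, qty: i % 7 })),
};

const iterations = 10000;
const start = process.hrtime.bigint();
let bytes = 0;
for (let i = 0; i < iterations; i++) {
  const text = JSON.stringify(payload);     // serialize
  const back = JSON.parse(text);            // and parse it back
  bytes += text.length + back.items.length; // keep the work observable
}
const ms = Number(process.hrtime.bigint() - start) / 1e6;
console.log(`${iterations} stringify+parse round trips in ${ms.toFixed(1)} ms`);
```

A real benchmark would vary payload size and nesting depth, since both strongly affect `JSON.parse` cost.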
https://github.com/nodejs/benchmarking/blob/master/docs/case_coverage.md is not completely empty, but it would be happier if there were fewer blank spaces (which I'm guessing is what you meant).
Yes, I realize that was not clear. I've edited my post to read "somewhat sparse".
From https://github.com/nodejs/benchmarking/pull/243:
@jorangreef I think you might have some comments on the "systems software" use case and perhaps others?
Firstly, thanks @davisjam and everyone here for your efforts expanding the Node.js benchmarking use cases.
Ronomon is an email startup in private beta. It falls into the "systems software" use case. Our new storage stack is being written in Node.js to drive 16x 10TB disks per server.
Things that are important for this use case:
The system encrypts and authenticates large 64KB+ fixed-size disk sectors, and needs to saturate the sequential write throughput of 16 disks. This requires HMAC and AES-256-CTR throughput > 1.6 GB/s. That rules out Node's synchronous crypto from the start, and makes asynchronous crypto essential (https://github.com/ronomon/crypto-async) to avoid blocking and to achieve multi-core throughput. If a disk or storage node fails and we need to rebuild, we can't afford to have the system bottlenecked on the throughput of a single CPU core doing crypto. The alternative of a cluster or multi-process solution would introduce needless complexity and overhead, and defeat the point of using Node.js in the first place, i.e. single-threaded non-blocking control plane with an asynchronous data plane.
Of course, the storage stack is not just doing crypto, it's also doing fs operations, using the same threadpool. At present, this is causing massive head-of-line blocking in the threadpool, with the much faster crypto tasks getting stuck behind the much slower fs tasks. You can imagine what happens when you race the Dakar Rally and the Monaco Grand Prix on the same track. For benchmarking, this means we need to benchmark the threadpool not just for DNS or FS tasks, but also for CPU-intensive tasks.
In addition to crypto and fs tasks, the storage stack also does erasure coding (https://github.com/ronomon/reed-solomon) and deduplication (https://github.com/ronomon/deduplication) using the threadpool. These are too CPU-intensive to be run synchronously, on the order of tens of milliseconds per task, and again we need multi-core throughput to saturate the disks' write bandwidth.
We use direct IO to raw block devices, for more control over a few things, not least to avoid spiking write commit latency due to large write buffer stalls. From a benchmarking point of view, this means that fs benchmarks should reflect realistic disk performance, instead of measuring only the filesystem cache. This becomes especially important when benchmarking the interaction between fs tasks and CPU tasks.
A single Node.js process for one of the storage servers manages 48-64 GB RAM. As a result, most of Ronomon's data structures are already large flat buffers, e.g. https://github.com/ronomon/hash-table, to reduce GC pause times, but reducing GC pause times under load remains critical to avoid blocking the event loop.
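A toy sketch of the flat-buffer idea (record layout and sizes invented for illustration): fixed-width records live in one Buffer, so the GC tracks a single large allocation instead of millions of small objects.

```javascript
// Hypothetical flat-buffer record store: one Buffer instead of many objects.
const RECORD_SIZE = 16;      // assumed layout: 8-byte key + 8-byte value
const COUNT = 100000;        // number of fixed-width record slots
const table = Buffer.alloc(RECORD_SIZE * COUNT);

function setRecord(i, key, value) {
  const offset = i * RECORD_SIZE;
  table.writeBigUInt64BE(key, offset);
  table.writeBigUInt64BE(value, offset + 8);
}

function getValue(i) {
  return table.readBigUInt64BE(i * RECORD_SIZE + 8);
}

setRecord(42, 7n, 99n);
console.log(getValue(42)); // → 99n
```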
Because of the large memory footprint, simple things like spawning a child process asynchronously using Node.js turned out to be synchronous instead, and led to the event loop blocking for 1-2 seconds per async spawn(). We eventually had to stop using spawn() and switched to a unix socket. More benchmarks for the Node.js API for large memory footprints would be brilliant.
I hope this helps, Node.js has been great so far, making it easy to dip into C when needed, and with Javascript as a fast control plane language. It's fantastic to have a whole benchmarking team, and I'm looking forward to seeing CPU-intensive tasks becoming first-class asynchronous citizens.
This issue is to discuss/identify candidate benchmarks. So far, what we have on the list is:
We expect that we'll want multiple, with at least one to cover each use case identified in https://github.com/nodejs/benchmarking/issues/5