evanw / esbuild

An extremely fast bundler for the web
https://esbuild.github.io/
MIT License
38.04k stars, 1.14k forks

Using native bindings #248

Closed wilsonzlin closed 2 years ago

wilsonzlin commented 4 years ago

Hello :wave:

I've managed to create a PoC Node.js module with native bindings to esbuild's Transform using cgo and Neon. Some preliminary benchmarks seem to indicate about 2X the speed compared to the current esbuild package, which appears to simply communicate over processes. The code is here; I ran it on JS libraries from 100 KB to 7 MB and used benchmark.

I'd be happy to explore integrating and testing something similar to the Node.js esbuild module if interested. I did notice some related old issues (#194 and #219), but I think this is possible and straightforward without breaking compatibility.

I came across this after attempting to embed esbuild within a Rust HTML minifier I wrote, which uses native bindings to run on Python, Ruby, Java, and Node.js across platforms.

evanw commented 4 years ago

Thanks for reaching out! I really appreciate it.

Just want to say up front that I'm not sure if I want to take this project in that direction or not. I was actually super into Rust in the past (indeed, esbuild used to be written in Rust) but I've more recently become spoiled by certain parts of Go such as pleasant compile times and how trivial it makes portable cross-platform deployments. I do care a lot about run-time speed for esbuild but it's not the only concern that I use to guide the implementation decisions, especially since esbuild is already so fast.

With the current architecture the stdio-based protocol is highly portable, can be made to work with any version of node, and will be easy to port to other languages. Adding support for more platforms is also super easy (just go build for that platform) and I can be very confident they will work on other users' machines without much testing because of how Go's static linking works. It's also super fast (compiling esbuild for all 8 platforms I support takes ~1s) whereas I've commonly experienced ~5min compile times per platform for my previous Rust projects. I wouldn't want a release for esbuild to take a long time, especially if I need to debug something in a release build.
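To illustrate why a stdio-based protocol ports easily, here is a minimal sketch of message framing over a byte stream. The format (a 4-byte little-endian length prefix followed by JSON) and the names `encodeFrame` and `createDecoder` are assumptions made for this example; they are not esbuild's actual wire format or API.

```javascript
// Illustrative framing only: a 4-byte little-endian length prefix followed by
// a JSON payload. This is NOT esbuild's actual wire format.
function encodeFrame(message) {
  const payload = Buffer.from(JSON.stringify(message), 'utf8');
  const header = Buffer.alloc(4);
  header.writeUInt32LE(payload.length, 0);
  return Buffer.concat([header, payload]);
}

// Returns a function that accepts arbitrary stream chunks and calls
// onMessage once per complete frame, buffering partial frames in between.
function createDecoder(onMessage) {
  let pending = Buffer.alloc(0);
  return function push(chunk) {
    pending = Buffer.concat([pending, chunk]);
    while (pending.length >= 4) {
      const size = pending.readUInt32LE(0);
      if (pending.length < 4 + size) break; // incomplete frame, wait for more bytes
      onMessage(JSON.parse(pending.subarray(4, 4 + size).toString('utf8')));
      pending = pending.subarray(4 + size);
    }
  };
}

// Simulate a pipe delivering two frames split at an awkward boundary.
const received = [];
const push = createDecoder(msg => received.push(msg));
const bytes = Buffer.concat([
  encodeFrame({ id: 1, code: 'let x = 1' }),
  encodeFrame({ id: 2, code: 'let y = 2' }),
]);
push(bytes.subarray(0, 7));
push(bytes.subarray(7));
console.log(received);
```

In a real setup the chunks would arrive from a child process's stdout `data` events. Nothing here depends on a native addon or a particular Node.js ABI, which is the portability property described above.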

I'm also wondering about the increase in technical complexity from a project perspective. I don't have a well-formed opinion about that part yet, but I'm aware that each new language and toolchain added comes with cost and overhead. For example, doing this would mean someone would have to be proficient in JavaScript, Go, and Rust to work on esbuild, as well as having to understand the nuances of bindings across all three environments.

That said, I'm technically interested in this direction because I haven't done native bindings with node before. And I'd like to be able to experience the speedup myself and do some profiling to compare it for some use cases I have in mind. I tried running the code you linked to but it didn't compile (macOS 10.14.6). Any ideas? I have the same error that is causing GitHub's CI to fail:

[esbuild-rs 0.0.5] ./esbuild.go:7:51: could not determine kind of name for C.byte
...
error: could not find native static library `esbuild`, perhaps an -L flag is missing?

wilsonzlin commented 4 years ago

Thanks for the reply :smile:

Yes agree fully, speed is not everything, and dev experience is important too :100:. Also, apologies on the broken build, I've fixed up the Go file. I ran a test on macOS, the bench result was almost 3x, even better than expected :laughing:. Just my 2c on the great points you raised:

Regarding cross platform support, I'm not fully aware of the Rust technical details but it seems to have decent support as well, with macOS/Windows/Linux ARM/x86, and it's possible to have cross-compile targets. I believe Rust and Go have similar goals of creating static self-contained reliable cross-platform binaries. FFI is also highly portable and is essentially just C.

For compile times, the main Rust file would just be some minor boilerplate FFI wrapping, and the Go library will still be compiled with Go and only need to be linked, which appears to be pretty fast (about 5 - 10 seconds on my machine, although right now I'm inefficiently fetching the Go module every time). Also, development would still be done in Go, and the Rust bindings would only be built for testing/releasing for Node.js specifically.

I think complexity wise it's reasonable, as basically the Rust FFI wrapper is two tiny functions (one to talk to Go, one to talk to Node.js), self-contained in one file. There might not even need to be any JS, if the library exports are simple enough :smile:. Similar to the previous point, the development would still remain fully within Go. Basically it's replacing the wrapper index.js that pipes to a process, and not touching the main project.

Let me know how it profiles! I have not done broad tests yet, and am also interested in how well it performs as I'm using it from Rust.

chowey commented 4 years ago

Would it not be possible to bind directly from Go to Node.js using Cgo?

I'm not speaking from any experience with using Node's C-bindings. But in theory at least, that's all Neon is doing. It is using Node's C-bindings.

I would expect that Cgo-Node.js would be faster than Cgo-Rust-Node.js. But probably not as easy.

wilsonzlin commented 4 years ago

Yes it should be possible. However, IMHO it's probably safer and easier to use Rust, and Neon is a thin wrapper around the native Node.js API, so it shouldn't have much overhead (if any).

Rust would provide guaranteed memory safety with automatic memory management, and safe and friendly zero-cost libraries (like trimming and copying C strings), without worrying about cross-platform compatibility. Replacing Rust with C/C++ potentially requires more infra/plumbing work and safety testing.

evanw commented 4 years ago

Just gave it a spin.

The first thing I noticed is that you're comparing against esbuild.transformSync(). This function creates a new child process for every call. It's only there for convenience and is not intended to be high-performance. For best performance you'll want to compare against the service-oriented API which is asynchronous and parallelized.

One of the benchmarks I've been using is to transform a non-trivial JavaScript file 1000x. This is a realistic use case. For example, Snowpack and Vite both use esbuild to do on-the-fly compilation of individual TypeScript modules in your project as they are loaded over the network by the browser's native ES module loader.

Here's my benchmark code:

let fs = require('fs')
let code = fs.readFileSync(__dirname + '/node_modules/esbuild-wasm/lib/main.js', 'utf8')

async function benchmark(name, esbuild) {
  let service = await esbuild.startService()

  // Warmup
  for (let i = 0; i < 10; i++) await service.transform(code, { minify: true })
  await new Promise(r => setTimeout(r, 250))

  // Transform 1000x in parallel
  let start = Date.now()
  let promises = []
  for (let i = 0; i < 1000; i++)
    promises.push(service.transform(code, { minify: true }))
  await Promise.all(promises)
  let time = Date.now() - start

  console.log(`${name}: ${time}ms`)
  service.stop()
  return time
}

function esbuild_rs() {
  let native = require('./esbuild.node');
  return {
    startService: async () => ({
      transform: async (code) => native.minify(Buffer.from(code)).toString(),
      stop: () => {},
    })
  }
}

async function main() {
  let a = await benchmark('esbuild', require('esbuild'))
  let b = await benchmark('esbuild-rs', esbuild_rs())
  console.log(a < b ? `${(b / a).toFixed(1)}x slower` : `${(a / b).toFixed(1)}x faster`)
}

main().catch(e => setTimeout(() => { throw e }))

To run it you'll need to install esbuild and copy the esbuild.node file from your project into the directory. Here's what I get when I run it:

esbuild: 455ms
esbuild-rs: 1229ms
2.7x slower

Using your library is 2.7x slower than using esbuild itself in this particular benchmark. This isn't an accurate comparison of course because it's comparing parallel with serial, so it doesn't mean much.

If anything it's promising because it means the parallelized version of your library should end up being faster than esbuild itself. Single-threaded esbuild is around 4x slower than multi-threaded esbuild on my six-core machine so I'd expect a similar 4x speedup on your library. A 4x speedup to your library should put you at around 1.5x faster than esbuild.
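As a rough check of that estimate, the arithmetic can be written out. The two timings are the ones reported above; the 4x factor is the assumed parallel speedup from this comment, not a measurement of the native library.

```javascript
// Wall-clock times reported by the benchmark run above (milliseconds).
const esbuildMs = 455;       // service-based esbuild, transforms run in parallel
const nativeSerialMs = 1229; // native binding, transforms run serially

// Assumption from the comment: parallelizing yields roughly a 4x speedup.
const assumedParallelSpeedup = 4;

const projectedNativeMs = nativeSerialMs / assumedParallelSpeedup;
const projectedRatio = esbuildMs / projectedNativeMs;

console.log(`projected native time: ${projectedNativeMs.toFixed(0)}ms`);
console.log(`projected speedup over esbuild: ${projectedRatio.toFixed(1)}x`);
```

This puts the parallel native version at roughly 307ms, or about 1.5x faster than the 455ms measured for esbuild.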

What would it take to get your library to be parallel? Is that something that neon supports? How does JavaScript's event loop interact with Go's scheduler?

wilsonzlin commented 4 years ago

So this is interesting.

I started off by simply using the built-in threading API for Node.js, but this wasn't able to saturate the CPU the way Go does. I think this is because Go is much more effective at concurrency, whereas the Node.js threading simply runs at most CPU_CORES calls at once, each blocking until Go returns.

It was clear that we should leverage the Go scheduler. Initially I was thinking to somehow integrate directly into Go internals, but this was too complex. I ended up simply using a callback mechanism: the Node.js native code would call a Go function, which takes a function pointer, starts a goroutine of the actual function with go, and returns immediately. When that's done, the goroutine will call the callback with the results.
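Seen from the JavaScript side, the callback mechanism can be sketched roughly as follows. `fakeNativeMinify` is a hypothetical stand-in for the native function: it uses `setImmediate` to simulate work finishing later and does not call Go. `minifyAsync` is likewise a name invented for this example.

```javascript
// Hypothetical stand-in for the native binding described above: it takes a
// completion callback and returns immediately, mirroring how the Go side is
// described as starting a goroutine and calling back when the work is done.
// setImmediate only simulates the deferred completion; no Go code runs here.
function fakeNativeMinify(input, callback) {
  setImmediate(() => {
    callback(null, Buffer.from(input.toString().trim()));
  });
}

// Wrap the callback-style call in a Promise so callers can await it.
function minifyAsync(code) {
  return new Promise((resolve, reject) => {
    fakeNativeMinify(Buffer.from(code), (err, out) => {
      if (err) reject(err);
      else resolve(out.toString());
    });
  });
}

// Several calls can be in flight at once without blocking the event loop.
const results = Promise.all(['  a = 1  ', '  b = 2  '].map(minifyAsync));
results.then(r => console.log(r));
```

Because each call returns right away, how many transforms actually run concurrently is decided on the other side of the binding rather than by a fixed pool of Node.js threads.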

You can see the code on the goevent branch. I switched over to C as using raw pointers became too verbose in Rust. Running the benchmark, it's able to fully saturate the CPU like the normal executable; the performance boost seems to vary based on file sizes. On the smaller end (e.g. React.js, 100 KB), it's around 25%, on larger files (e.g. Plotly.js, 7 MB), it's around 500%.

I will look into the performance a bit deeper; there are a few hot code parts which could be improved. I am slightly surprised at the outcome disparity though, and would've guessed an opposite outcome, where larger files have smaller perf differences due to the processing itself taking up the most time.

evanw commented 2 years ago

I'm going to close this since I don't want esbuild's build process to require cgo. That's a build system complexity threshold that I don't want to cross. One of my reasons for choosing Go in the first place is that it can do fast and simple cross-platform builds.