Change to default Netty configuration
The main improvement in this PR is a change in the sizing of Netty's EventLoopGroup. By default, Netty will size the EventLoopGroup at nCPU * 2, and will use this pool for 2 purposes:
Accepting new connections (boss)
Running the pipeline and IO (worker)
The sizing of this pool is extremely conservative and assumes that some blocking IO might occur in Netty's worker threads. It also guards against cases where the underlying transport is blocking. However, since we're delegating all user-defined logic to ZIO's executor, and since the transports we're using are all non-blocking (Epoll, KQueue, NIO), a better configuration is to:
Use a separate EventLoopGroup with nThreads=1 for the boss group. This way, we avoid delays in accepting new connections when the worker threads are busy serving existing connections
Size the worker EventLoopGroup with nThreads=nCPU since there's no blocking code running in the worker threads
With this change, we see a ~10% increase in throughput vs the default configuration (see the sketch below).
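As an illustration of this split with Netty's bootstrap API, here is a minimal sketch using the NIO transport (the Epoll / KQueue variants are analogous). The value names and the omission of a child handler are for brevity only; this is not the actual code from the PR:

```scala
import io.netty.bootstrap.ServerBootstrap
import io.netty.channel.nio.NioEventLoopGroup
import io.netty.channel.socket.nio.NioServerSocketChannel

// dedicated 1-thread group that only accepts connections
val bossGroup = new NioEventLoopGroup(1)
// nCPU-sized group that runs the pipeline / IO; no blocking work ever runs here
val workerGroup = new NioEventLoopGroup(Runtime.getRuntime.availableProcessors())

val bootstrap = new ServerBootstrap()
  .group(bossGroup, workerGroup)
  .channel(classOf[NioServerSocketChannel])
```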
Use ctx.channel.write instead of ctx.write in ServerInboundHandler
This part is not really well documented in Netty, but the main differences are:
ctx.write / ctx.writeAndFlush will walk the pipeline from the current handler to the head of the pipeline when writing a response
ctx.channel.write / ctx.channel.writeAndFlush will start at the tail of the pipeline and walk through all the handlers when writing a response
In our case, short-cutting writes to the pipeline by starting at the current handler doesn't bring any benefit because the ServerInboundHandler is the last one in the pipeline. In addition, as this video on Netty's best practices seems to suggest, when we're writing to the channel from a different thread it's better to use ctx.channel.write.
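To make the difference concrete, here is a stand-alone sketch (a toy handler, not the actual ServerInboundHandler):

```scala
import io.netty.channel.{ChannelHandlerContext, ChannelInboundHandlerAdapter}

class LastHandler extends ChannelInboundHandlerAdapter {
  override def channelRead(ctx: ChannelHandlerContext, msg: AnyRef): Unit = {
    // ctx.writeAndFlush(msg)
    //   would start at this handler and walk outbound handlers towards the head

    // starts at the tail and traverses every outbound handler in the pipeline;
    // equivalent here since this handler is the last one, and the form the
    // best-practices talk recommends when writing from a different thread
    ctx.channel.writeAndFlush(msg)
  }
}
```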
Other changes
removed the runtime scope from the epoll / kqueue native transports. These are generally tiny (~20kb each), and I think it's better for users to have them available by default rather than having to install them manually and align Netty versions (see the build.sbt sketch after this list)
updated the ./.devcontainer files to install wrk and bump the Java / SBT versions, making it quicker to spin up a devcontainer for benchmarking purposes
updated build.sbt to fork processes when running sbt zioHttpExample/runMain example.xx, to avoid issues when restarting servers
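For reference, a build.sbt-style sketch of the two build-related changes above. The artifact names are the real Netty modules, but the version value and classifiers shown here are illustrative, not the exact diff:

```scala
// should match the Netty version zio-http is built against (illustrative value)
val NettyVersion = "4.1.100.Final"

libraryDependencies ++= Seq(
  // previously declared with `% Runtime`; without the scope qualifier they are
  // regular compile-scope dependencies and reach users transitively
  "io.netty" % "netty-transport-native-epoll"  % NettyVersion classifier "linux-x86_64",
  "io.netty" % "netty-transport-native-kqueue" % NettyVersion classifier "osx-x86_64"
)

// fork a fresh JVM for runMain so a restarted example server doesn't collide
// with a previous run that is still holding the port
fork := true
```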