mtrudel / bandit

Bandit is a pure Elixir HTTP server for Plug & WebSock applications
MIT License
1.7k stars 86 forks source link

Add support for running optimized native code for websocket mask #394

Closed alisinabh closed 1 week ago

alisinabh commented 3 months ago

This PR adds optional support for native implementation of some functions for bandit.

Some functions like websocket masking, and UTF8 validation can use more efficient native implementations which can speed up bandit in specific use-cases.

Other options

Instead of relying on bandit_native package, this can also be a compile config + a behaviour like this:

# config.exs
config :bandit, native_module: BanditNative

# native_lib.ex
defmodule Bandit.NativeLib do
  @callback ws_mask(payload :: binary(), mask :: non_neg_integer()) :: binary()
end

Update

Instead of relying on a single bandit_native library, this PR switched to using a configuration based approach using a behaviour for so called PrimitiveOps.

mtrudel commented 1 month ago

What do you need from me here @alisinabh? Is this ready to be merged?

alisinabh commented 1 month ago

It is not ready to be merged at this point.

I think we should discuss which approach we want to take (a bandit_native lib vs a compile time configurable module name) and then go forward. What do you think?

sleipnir commented 1 month ago

It is not ready to be merged at this point.

I think we should discuss which approach we want to take (a bandit_native lib vs a compile time configurable module name) and then go forward. What do you think?

I vote for this to be configurable

alisinabh commented 3 weeks ago

@mtrudel This is now ready for a review.

I have also updated https://github.com/alisinabh/bandit_native if you want to give it a test.

mtrudel commented 2 weeks ago

A couple of things:

alisinabh commented 2 weeks ago

Why is this being configured on HTTP/1 when it's al WebSocket behaviour? Should configuration live on websocket_opts in this case?

Sorry this is my bad. Will move it there.

I wonder if we should be creating a PrimitiveOps behaviour for each protocol (ie: Bandit.Websocket.PrimitiveOps instead of Bandit.PrimitiveOps)? This would mesh well with the per-protocol configuration question above.

I personally prefer one module for now as it makes configuration easier. But no strong opinions from me.

On another note, I was wondering if we should move this configuration outside of protocols config and make it a bandit level configuration like below. I think other than benchmarking reasons, it does not make sense to have two bandit instances running with different primitive ops implementations in the same application. (Not sure if we discussed this before)

config :bandit, primitive_ops_module: MyCustomModule
# or
config :bandit, websoket_primitive_ops_module: MyCustomModule
mtrudel commented 2 weeks ago

I think I prefer having a module per protocol, as it makes what you're working on clearer. If they happen to resolve to identical native modules that's fine, but the mapping from Bandit-land to native-land should probably be done on a per-protocol basis. I don't hold that too strongly though.

In either case, we should mark the config points as experimental and subject to change.

alisinabh commented 2 weeks ago

@mtrudel Agreed. I have updated this PR accordingly.

We now have Bandit.PrimitiveOps.Websocket which defines the bahaviour and the default implementation (let me know if you prefer a separate module for the default implementation e.g. Bandit.PrimitiveOps.Websocket.Default)

mtrudel commented 1 week ago

Looks great!

gmile commented 1 week ago

@alisinabh out of curiosity, do you benchmarking results to share for bandit_native?

alisinabh commented 1 week ago

@gmile I have the following benchmark but I wouldn't say it is a good one.

Operating System: macOS
CPU Information: Apple M1 Max
Number of Available Cores: 10
Available memory: 64 GB
Elixir 1.17.0
Erlang 27.0
JIT enabled: true

Benchmark suite executing with the following configuration:
warmup: 5 s
time: 7 s
memory time: 0 ns
reduction time: 0 ns
parallel: 1
inputs: none specified
Estimated total run time: 24 s

Name                                    ips        average  deviation         median         99th %
Rust NIF - large (1MB)              21.85 K      0.0458 ms   ±153.03%      0.0442 ms      0.0988 ms
Bandit (Elixir) - large (1MB)        0.63 K        1.60 ms    ±43.85%        1.55 ms        1.73 ms

Comparison: 
Rust NIF - large (1MB)              21.85 K
Bandit (Elixir) - large (1MB)        0.63 K - 34.88x slower +1.55 ms

------------------------------------------------------------------------------------------------------

Name                                       ips        average  deviation         median         99th %
Rust NIF - medium (100KB)              77.62 K       12.88 μs    ±52.33%       12.83 μs       17.83 μs
Bandit (Elixir) - medium (100KB)        6.62 K      151.09 μs     ±4.71%      150.58 μs      164.04 μs

Comparison: 
Rust NIF - medium (100KB)              77.62 K
Bandit (Elixir) - medium (100KB)        6.62 K - 11.73x slower +138.21 μs

------------------------------------------------------------------------------------------------------

Name                                    ips        average  deviation         median         99th %
Rust NIF - small (5KB)              98.31 K       10.17 μs    ±62.86%       10.13 μs       14.42 μs
Bandit (Elixir) - small (5KB)       58.06 K       17.22 μs    ±15.64%       17.13 μs       22.58 μs

Comparison: 
Rust NIF - small (5KB)              98.31 K
Bandit (Elixir) - small (5KB)       58.06 K - 1.69x slower +7.05 μs

------------------------------------------------------------------------------------------------------

Name                                  ips        average  deviation         median         99th %
Rust NIF - xs (500B)             100.87 K        9.91 μs    ±62.40%        9.83 μs       13.92 μs
Bandit (Elixir) - xs (500B)       95.08 K       10.52 μs    ±21.80%       10.50 μs       14.58 μs

Comparison: 
Rust NIF - xs (500B)             100.87 K
Bandit (Elixir) - xs (500B)       95.08 K - 1.06x slower +0.60 μs