Terkwood / AugustDB

Key/value store backed by LSM Tree architecture.
MIT License

Memtable in rust NIF #96

Closed: Terkwood closed this issue 3 years ago

Terkwood commented 3 years ago

Problem statement

Calling the Agent in front of the Memtable is slow. In fact, querying the :gb_trees structure isn't the slow part: under load, the bulk of the time is spent waiting for the Agent process to respond!

Is it possible to somehow hide the memtable behind a rust/NIF facade and speed up access to this large, shared structure?

Also see #97 -- perhaps there's some gain to be made by swapping out Agent for GenServer.

Simple plan

You could just create an Elixir Agent which exposes read-only state: the only thing it can do is give you a copy of a ResourceArc pointing to the rbtree.

Then use that ResourceArc to mutate a red-black tree. Try using the rbtree module of the intrusive-collections crate to represent the memtable. For an easy test implementation, store all keys and values as strings.
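A minimal, std-only sketch of the shape this plan implies. It is not the AugustDB implementation: `std::sync::Arc` stands in for Rustler's `ResourceArc`, `BTreeMap` stands in for the intrusive red-black tree, and the names `Memtable`, `put`, `get`, and `new_handle` are hypothetical. The point is only that every caller holding a clone of the handle can read and write the same tree without going through a single process.

```rust
use std::collections::BTreeMap;
use std::sync::{Arc, RwLock};

/// Stand-in for the memtable: an ordered map from string keys to string values.
/// BTreeMap keeps this sketch std-only; the issue proposes an intrusive rbtree.
#[derive(Default)]
pub struct Memtable {
    tree: RwLock<BTreeMap<String, String>>,
}

impl Memtable {
    /// Inserts or overwrites a key, returning the previous value if there was one.
    pub fn put(&self, key: &str, value: &str) -> Option<String> {
        let mut tree = self.tree.write().expect("memtable lock poisoned");
        tree.insert(key.to_owned(), value.to_owned())
    }

    /// Looks up a key under a shared read lock, so readers do not block each other.
    pub fn get(&self, key: &str) -> Option<String> {
        let tree = self.tree.read().expect("memtable lock poisoned");
        tree.get(key).cloned()
    }
}

/// Hypothetical helper: creates the shared handle that would be handed out
/// to callers (Arc here, ResourceArc in an actual Rustler NIF).
pub fn new_handle() -> Arc<Memtable> {
    Arc::new(Memtable::default())
}
```

Cloning the handle with `Arc::clone(&handle)` only bumps a reference count, so a write made through one clone with `put("k", "v")` is visible to `get("k")` on any other clone.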

Some inspiration: https://github.com/Terkwood/rustler/commit/e944e6ee47fba1ac4a124cc21a93d1030641d9b9

Background

As load increases, the discrepancy gets worse: the time spent inside the Agent (server-side) stays relatively stable, while the client waits longer and longer for the process to respond.

The value_controller.ex (client-side) section takes ~10 units of time:

def show(conn, %{"id" => key}) do
    mt_start = :os.system_time()

    case Memtable.query(key) do
      {:value, data, _time} when is_binary(data) ->
        mt_stop = :os.system_time()
        mtime = mt_stop - mt_start
        IO.puts("controller show #{mtime}")
        render(conn, "show.json", %{value: data})

The internal Agent.get takes only ~1 unit of time:

case Agent.get(__MODULE__, fn %__MODULE__{current: current, flushing: flushing} ->
           aq_start = :os.system_time()

           case :gb_trees.lookup(key, current) do
             :none ->
               out = :gb_trees.lookup(key, flushing)
               aq_stop = :os.system_time()

               aqt = aq_stop - aq_start
               IO.puts("agent query 🕑 #{aqt}")
               out

             some ->
               aq_stop = :os.system_time()

               aqt = aq_stop - aq_start
               IO.puts("agent query 🐢 #{aqt}")
               some
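The same two measurements can be reproduced outside the BEAM. The sketch below is an illustration under stated assumptions, not AugustDB code: a single thread plays the role of the Agent, an `mpsc` channel plays the role of the process mailbox, and `Request`, `spawn_agent`, and `timed_query` are hypothetical names. It reports both the time spent inside the handler and the full round trip seen by the caller, which includes any time the request spent queued behind others.

```rust
use std::collections::HashMap;
use std::sync::mpsc;
use std::thread;
use std::time::{Duration, Instant};

/// Hypothetical request: a key plus a channel on which to send the reply.
pub struct Request {
    key: String,
    reply: mpsc::Sender<(Option<String>, Duration)>,
}

/// Spawns one "agent" thread that owns the state and serves requests in order.
pub fn spawn_agent() -> mpsc::Sender<Request> {
    let (tx, rx) = mpsc::channel::<Request>();
    thread::spawn(move || {
        let mut state = HashMap::new();
        state.insert("k".to_string(), "v".to_string());
        for req in rx {
            // Server-side timing: covers only the lookup itself.
            let start = Instant::now();
            let found = state.get(&req.key).cloned();
            let _ = req.reply.send((found, start.elapsed()));
        }
    });
    tx
}

/// Returns (value, time inside the agent, round trip seen by the caller).
pub fn timed_query(
    agent: &mpsc::Sender<Request>,
    key: &str,
) -> (Option<String>, Duration, Duration) {
    let (reply_tx, reply_rx) = mpsc::channel();
    // Client-side timing: covers queueing, the lookup, and the reply.
    let start = Instant::now();
    agent
        .send(Request { key: key.to_owned(), reply: reply_tx })
        .expect("agent stopped");
    let (value, inside) = reply_rx.recv().expect("agent dropped the reply");
    (value, inside, start.elapsed())
}
```

Because all requests pass through one queue, the gap between the two durations grows with the number of concurrent callers even when the lookup time stays flat.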

Tidbits on the web

Terkwood commented 3 years ago

https://hansihe.com/posts/rustler-safe-erlang-elixir-nifs-in-rust/

Terkwood commented 3 years ago

A better approach would be to use NIFs with resource objects. Resource objects allow you to associate a piece of native data with an opaque Erlang term. You can create a Rust struct containing a binary buffer (or anything else), put that in a resource object, and use that as an argument to subsequent NIF calls.

This approach has many advantages, including very fast mutation and random access to the buffer, with no need to leave the current process.
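A rough, std-only sketch of the pattern described in that quote, assuming nothing about the real NIF: `Arc` plays the role of the opaque resource handle (Rustler's `ResourceArc` in practice), and `BufferResource`, `buffer_new`, `buffer_set`, and `buffer_get` are hypothetical names. Each function takes the handle as an argument, the way subsequent NIF calls would.

```rust
use std::sync::{Arc, Mutex};

/// Native data that would live behind the opaque term: a mutable byte buffer.
pub struct BufferResource {
    data: Mutex<Vec<u8>>,
}

/// Stand-in for the opaque handle passed back and forth between calls.
pub type Handle = Arc<BufferResource>;

/// Creates a zero-filled buffer and returns a handle to it.
pub fn buffer_new(len: usize) -> Handle {
    Arc::new(BufferResource { data: Mutex::new(vec![0; len]) })
}

/// Mutates one byte in place; fails if the index is out of bounds.
pub fn buffer_set(handle: &Handle, index: usize, byte: u8) -> Result<(), String> {
    let mut data = handle.data.lock().map_err(|_| "lock poisoned".to_string())?;
    match data.get_mut(index) {
        Some(slot) => {
            *slot = byte;
            Ok(())
        }
        None => Err(format!("index {} out of bounds", index)),
    }
}

/// Random access read; returns None if the index is out of bounds.
pub fn buffer_get(handle: &Handle, index: usize) -> Option<u8> {
    let data = handle.data.lock().ok()?;
    data.get(index).copied()
}
```

The caller never copies the buffer: it only holds the handle, and each call locks, touches one byte, and returns.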

Terkwood commented 3 years ago

Decent setup info here: https://www.kabisa.nl/tech/when-elixirs-performance-becomes-rust-y/

Terkwood commented 3 years ago

And read the manual's example: https://github.com/rusterlium/NifIo/blob/master/native/io/src/lib.rs

Terkwood commented 3 years ago

https://docs.rs/rustler/0.22.0/rustler/resource/struct.ResourceArc.html

Terkwood commented 3 years ago

https://github.com/rusterlium/rustler/blob/e5d7e3d1e8be32f31a7e8ae542c2106474405613/rustler_tests/native/rustler_test/src/test_resource.rs

Terkwood commented 3 years ago

Some WIP on the branch associated with #98.

Didn't see much of a speed improvement when passing a ResourceArc around using an Agent.

And we saw errors!