boundary / folsom

Expose Erlang Events and Metrics

Performance improvements to spiral, counter, histogram & spiral_uniform #45

Closed by Vagabond 11 years ago

Vagabond commented 11 years ago

Changes include:

Partition counter and spiral writes across several keys, with the key chosen by erlang:system_info(scheduler_id) and a bitwise mask, so that concurrent writers rarely contend on the same key. The fixed mapping between Erlang scheduler thread and partitioned key may also improve cache behavior.
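A minimal sketch of the partitioning idea, not the code from this branch. The module name, the table layout, and the partition count of 16 are assumptions made for illustration, and it assumes an OTP release that provides ets:update_counter/4.

```erlang
-module(partitioned_counter).
-export([new/1, inc/3, value/2]).

%% Hypothetical partition count. It must be a power of two so that
%% (?PARTITIONS - 1) can be used as a bitwise mask.
-define(PARTITIONS, 16).

%% Create a public, named ETS table to hold the partitioned rows.
new(Table) ->
    ets:new(Table, [set, public, named_table, {write_concurrency, true}]).

%% Increment the row that belongs to the calling scheduler's partition.
inc(Table, Name, Incr) ->
    Part = erlang:system_info(scheduler_id) band (?PARTITIONS - 1),
    Key = {Name, Part},
    %% The fourth argument is the default row used when Key is absent.
    ets:update_counter(Table, Key, Incr, {Key, 0}).

%% Read the counter by summing every partition that has been written.
value(Table, Name) ->
    lists:sum([V || Part <- lists:seq(0, ?PARTITIONS - 1),
                    {_, V} <- ets:lookup(Table, {Name, Part})]).
```

Writes touch one row per scheduler partition, while reads pay the cost of summing all partitions. That trade-off suits metrics that are written far more often than they are read.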

Switch spiral and slide_uniform from ordered_set to set. A set table supports fine-grained locking, whereas ordered_set requires a full-table lock. Combining set tables with separated values greatly reduces ETS contention.

Change histogram to avoid an ETS insert if the sample passed into the histogram update function matches the result.

There are two places in folsom where an ets:insert_new is immediately followed by an ets:update_counter on the same key. Since the key usually already exists, this can be optimized by trying update_counter first inside a try/catch and doing the insert_new only if the update fails. This is provided as a utility function called folsom_utils:update_counter().
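The update-first pattern described above can be sketched as follows. This is an illustration under assumptions, not the folsom_utils source: the module name counter_utils and the {Key, 0} row shape are hypothetical.

```erlang
-module(counter_utils).
-export([update_counter/3]).

%% Try the cheap path first: update the existing row.
%% Fall back to insert_new only when the update fails.
update_counter(Tid, Key, Value) when is_integer(Value) ->
    try
        ets:update_counter(Tid, Key, Value)
    catch
        error:badarg ->
            %% Most likely the key is missing. insert_new returns false
            %% without overwriting if another process created the row first.
            ets:insert_new(Tid, {Key, 0}),
            ets:update_counter(Tid, Key, Value)
    end.
```

In the common case where the key exists, this costs one ETS operation instead of two. If the table itself is missing, the second update_counter raises badarg again, so that error still reaches the caller.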

There was a bug in slide_uniform: the probability of doing a write did not decrease as more updates arrived within a particular moment, so every slide_uniform update effectively resulted in a write. This bug has been corrected, along with the QuickCheck test.
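For reference, the intended behavior resembles standard reservoir sampling, where the chance of a write falls as the per-moment update count grows. The sketch below shows that textbook technique; the module and function names are hypothetical and it is not taken from folsom's slide_uniform implementation.

```erlang
-module(reservoir_sketch).
-export([maybe_slot/2]).

%% Count is the number of updates seen in the current moment, including
%% this one. Size is the reservoir size.
%% Returns {write, Slot} when the sample should be stored, or skip.

%% While the reservoir is still filling, every sample is written.
maybe_slot(Count, Size) when Count =< Size ->
    {write, Count};
%% Afterwards a sample is written with probability Size / Count,
%% replacing a uniformly chosen slot.
maybe_slot(Count, Size) ->
    case rand:uniform(Count) of
        Slot when Slot =< Size -> {write, Slot};
        _ -> skip
    end.
```

With a reservoir of 1028 slots, the 10,280th update in a moment would be written with probability 0.1, so most updates under heavy load skip the ETS write entirely.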


Lots of credit here goes to @jtuple, who did a lot of this work, particularly around partitioning writes to avoid contention. He also wrote the benchmarking code below.

Note that these speedups are most apparent when many concurrent processes write to a folsom stat. The single-writer case also improves, but less dramatically.


Microbenchmark results for folsom master vs adt-speedups (time in seconds):

40,000 workers doing 100 writes each:

| metric | master (s) | adt-speedups (s) | speedup |
|---|---|---|---|
| histogram (slide_uniform) | 40 | 4 | 10x |
| spiral | 16 | 0.8 | 20x |
| counter | 4 | 0.3 | 13x |

80,000 workers doing 100 writes each:

| metric | master (s) | adt-speedups (s) | speedup |
|---|---|---|---|
| histogram (slide_uniform) | 78 | 7 | 11x |
| spiral | 33 | 1.6 | 20x |
| counter | 8 | 0.6 | 13x |

Benchmark:

-module(stats).
-compile(export_all).

%% Returns {Microseconds, ok}; divide by 1.0e6 to get seconds.
go() ->
    timer:tc(fun test/0).

test() ->
    folsom:start(),
    Size = 100,
    NumW = 40000,
    %% Size = 1000,
    %% NumW = 10000,
    new_histogram(test),
    %% new_spiral(test),
    %% new_counter(test),
    %% [new_counter({x, N}) || N <- lists:seq(1,1000000)],
    Self = self(),
    [spawn(fun() ->
                   worker(Size),
                   Self ! done
           end) || _ <- lists:seq(1,NumW)],
    wait(0, NumW),
    ok.

wait(Same, Same) ->
    ok;
wait(N, NumW) ->
    receive
        done ->
            wait(N+1, NumW)
    end.

worker(0) ->
    ok;
worker(N) ->
    add_histogram(test, 100),
    %% add_spiral(test),
    %% add_counter(test,1),
    worker(N-1).

new_histogram(Name) ->
    {SampleType, SampleArgs} = {slide_uniform, {60, 1028}},
    folsom_metrics:new_histogram(Name, SampleType, SampleArgs).

add_histogram(Name, Value) ->
    folsom_metrics:notify_existing_metric(Name, Value, histogram).

new_spiral(Name) ->
    folsom_metrics:new_spiral(Name).

add_spiral(Name) ->
    folsom_metrics:notify_existing_metric(Name, 1, spiral).

new_counter(Name) ->
    folsom_metrics:new_counter(Name).

add_counter(Name, Value) ->
    folsom_metrics:notify_existing_metric(Name, {inc, Value}, counter).