Shopify / statsd-instrument

A StatsD client for Ruby apps. Provides metaprogramming methods to inject StatsD instrumentation into your code.
http://shopify.github.io/statsd-instrument
MIT License
570 stars 94 forks source link

Datagram builder as a C extension #259

Open methodmissing opened 4 years ago

methodmissing commented 4 years ago

Why?

The datagram builder is heavy on allocation (and reallocation) when coercing both metric names and tags into normalized and valid statsd wire protocol components.

This extension is also implemented as a wrapped struct without any global state as @hkdsun and @bmansoob expressed interest in how to do that.

Points of attack

Configurables

Resiliency issues covered

Wrapped builder struct layout

Fits within 1 cache line (46 of 64 bytes) with the buffer as the last member. 4kb is very generous, but in reality the vast majority of statsd datagrams would only reach into 1 or 2 more cache lines.

The struct is passed around by reference as it's heap allocated anyways and few things are as bad as large structs passed by value.

lourens@CarbonX1:~/src/statsd-instrument/lib/statsd/instrument/ext$ pahole -C datagram_builder ./statsd.so
struct datagram_builder {
    st_table *                 normalized_tags_cache; /*     0     8 */
    st_table *                 normalized_names_cache; /*     8     8 */
    VALUE                      str_normalize_chars;  /*    16     8 */
    VALUE                      str_normalize_replacement; /*    24     8 */
    VALUE                      default_tags;         /*    32     8 */
    _Bool                      empty_default_tags;   /*    40     1 */

    /* XXX 3 bytes hole, try to pack */

    int                        prefix_len;           /*    44     4 */
    int                        len;                  /*    48     4 */
    char                       datagram[4096];       /*    52  4096 */

    /* size: 4152, cachelines: 65, members: 9 */
    /* sum members: 4145, holes: 1, sum holes: 3 */
    /* padding: 4 */
    /* last cacheline: 56 bytes */
};

Future TODO

Downsides

Other ideas

@csfrancis @wvanbergen

methodmissing commented 4 years ago

... and some CI massaging required

Screenshot from 2020-03-30 22-56-46

wvanbergen commented 4 years ago

This library now becomes dependent of a C compiler and ruby development headers - need to think about decoupling so that is optional.

In think there are two ways to address this:

  1. The datagram builder is already configurable. We could ship this as a separate datagram builder class (potentially as a different gem). The client can then configure to use this different datagram builder class, maybe using an environment variable.
  2. We create a separate gem that when loaded, will overwrite the datagram builder class with this native implementation.
methodmissing commented 4 years ago

Yeah that makes sense - like the sequel model

methodmissing commented 4 years ago

@wvanbergen objections to spawning statsd-instrument-c (blocked on a better name, patches welcome :smile:)? That would also introduce some other issues to think about: