heroku / buildpacks-ruby

Heroku's Cloud Native Buildpack for Ruby applications.
BSD 3-Clause "New" or "Revised" License
1 stars 1 forks source link

Add Ruby app metrics support #172

Closed schneems closed 1 year ago

schneems commented 1 year ago

Ruby supports "language specific metrics (beta)" https://devcenter.heroku.com/articles/language-runtime-metrics-ruby through the barnes gem https://github.com/heroku/barnes and the heroku/metrics buildpack https://github.com/heroku/heroku-buildpack-metrics. This PR ports the logic of heroku/metrics to the Ruby CNB buildpack.

Implementation

At build time, this buildpack will download a statsd client (named "agentmon") from S3 and place it on the path. There are currently no explicit interfaces for running background/daemon processes via the CNB spec, however there is an interface that guarantees it will execute arbitrary code (exec.d). This interface is intended to set dynamic environment variables at runtime. As a side effect, this interface is used to schedule a background daemon via the start-stop-daemon command docs.

The start-stop-daemon calls our program, a new rust script in bin/agentmon_loop. This rust script will boot agentmon in a loop if the necessary environment variables are set.

GUS-W-13115697. GUS-W-13815707.

schneems commented 1 year ago

I updated agentmon_loop I'm still looking at your other comments for main.rs and the layer probably tomorrow.

schneems commented 1 year ago

Updated to address the rest of the comments. I reworked a lot. Moved the bash script that schedules the daemon into a rust script. Moved to a hardcoded URL.

schneems commented 1 year ago

I think there's a race condition in the tests previously it was failing with

thread 'test_barnes_app' panicked at 'assertion failed: `(left contains right)`
left (unescaped):
    PID TTY      STAT   TIME COMMAND
      1 ?        Rs     0:00 ps x
     14 ?        R      0:00 [start-stop-daem]

left (escaped): `"    PID TTY      STAT   TIME COMMAND\n      1 ?        Rs     0:00 ps x\n     14 ?        R      0:00 [start-stop-daem]\n"`
right: `"agentmon_loop --path"`', buildpacks/ruby/tests/integration_test.rs:138:21
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace

Now it's failing with

---- test_barnes_app stdout ----
    PID TTY      STAT   TIME COMMAND
      1 ?        Rs     0:00 ps x
     14 ?        R      0:00 start-stop-daemon --start --background --exec /layers/heroku_ruby/metrics_agent/agentmon_loop -- --path /layers/heroku_ruby/metrics_agent/bin/agentmon

thread 'test_barnes_app' panicked at 'assertion failed: `(left contains right)`
left (unescaped):
    PID TTY      STAT   TIME COMMAND
      1 ?        Rs     0:00 ps x
     14 ?        R      0:00 start-stop-daemon --start --background --exec /layers/heroku_ruby/metrics_agent/agentmon_loop -- --path /layers/heroku_ruby/metrics_agent/bin/agentmon

left (escaped): `"    PID TTY      STAT   TIME COMMAND\n      1 ?        Rs     0:00 ps x\n     14 ?        R      0:00 start-stop-daemon --start --background --exec /layers/heroku_ruby/metrics_agent/agentmon_loop -- --path /layers/heroku_ruby/metrics_agent/bin/agentmon\n"`
right: `"agentmon_loop --path"`', buildpacks/ruby/tests/integration_test.rs:[138](https://github.com/heroku/buildpacks-ruby/actions/runs/6153895381/job/16698442352?pr=172#step:8:139):21
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace

I think it looks like this:

So depending on when we read the results of ps we can get a different result. I could possibly start the container and then loop/sleep a few times to try to get some determinisim. Maybe there's a better way to assert that:

Another tool to use is to look at the specified output log file (if we set AGENTMON_DEBUG=1). It would probably be best to print that out anyway in case an error comes up.

schneems commented 1 year ago

I fixed the tests (https://github.com/heroku/buildpacks-ruby/pull/172#issuecomment-1714857517) to be more deterministic, but it's still based on a timeout (which can fail if the machine is slow or docker is slow due to timing conditions). It's better than it was, though. Please take a look. Specifically, this line does not seem great https://github.com/heroku/buildpacks-ruby/pull/172/files#diff-17c88161984214fafac5afdc89a5919f2dbcfb1165f2560fd255809316095fe8R131.

I applied all of your requested changes. Thanks for the 👀 . I've asked @Malax for another round of reviews.

schneems commented 1 year ago

I've updated to use a checksum with the libherokubuildpack shared tooling.