Closed brian-kephart closed 4 years ago
Update: added a test and fixed a bug.
Update: I've been running this in production for the last couple years, and have continued to update it as needed. It's still in good shape to merge if anyone else thinks it's a good idea.
Keeping the dream alive. I'm looking at this right now. Thanks for all of your work so far!
Okay, having looked at this indepth. I don't think I want this in PWK as-is. That's the bad news.
I have some good news, so stay tuned.
I don't like that we're having to stream and process logs to get the information that we want. It seems extremely CPU heavy and more likely to break the larger your application is. That's why I'm not super thrilled about mergging this in.
The good news though is there is a private API you can tap into to get this information directly. Here's what I think would be a good course of action.
You: Make a gem that is a really loose wrapper around this API endpoint, call it something like heroku_memory_metrics
or something. Please test this with webmock.
Me: Tell you about the API right now:
You can hit it like this:
curl -v -H "Authorization: bearer $(heroku auth:token)" "https://api.metrics.heroku.com/apps/<app-name>/formation/web/metrics/memory?step=1m&start_time=2019-11-05T16%3A51%3A00.000Z&end_time=2019-11-05T18%3A31%3A00.000Z&process_type=web"
Replace your <app-name>
with the actual name of the app. The start time and end time are formatted like this %Y-%m-%dT%H:%M:00.000Z`. Then the step duration is provided in minutes (i.e. if you say you want to get the last 7 days of data, then you will lose resolution).
The output is json
# ....
"memory.used.bytes.mean":[779493921.3333334,778211163.6666666,782405467.6666666,781353396.3333334,781283491.5,776302755.3333334,773401695,773122075,776054592.3333334,774373375.6666666,771154247,771935436,773551990.6666666,776956367.3333334,776145469,778655060.6666666,776257317,778253106.6666666,776449556,777658913.6666666,779958790.3333334,780018209.6666666,779832961.3333334,782926260.3333334,779556836.3333334,778745937,782020990,781440778,780434144.6666666,785519738.3333334,786341123,781605055,784198532.5,782440420.3333334,774523671.6666666,775911287,774764844,771168228.3333334,770539082.6666666,773115084,773132560.3333334,771989612.6666666,775282141.3333334,777658913.6666666,780500554.6666666,776652280.6666666,782950727,783436567.3333334,783869978.6666666,784453686.3333334,787637862,790493484,791842651.6666666,790778347,786288694.3333334,780371230.3333334,780434145,789252669,786183836.3333334,783800074,783674244.3333334,782146819,784226494.6666666,786634724.3333334,785079336.3333334,786687153.3333334,786376075.6666666,494720777.75,544647850,548060964.8333334,665335452.3333334,703804210.75,640382839,661647960.3333334,670968053.5,682776766.6666666,691147898.3333334,699931470,703542067,710315867.6666666,715132326.6666666,717372784.3333334,719774023.5,719064487,722189243.3333334,726862397,731741770.6666666,738386247.3333334,743045419.6666666,743104839,744261768,742594532,741832567,743723499,745761231.6666666,747134866.3333334,747613716,752562994.6666666,749710868.3333334,750102336.6666666,750429142.5]}}
That can then be parsed.
It's not documented (hence the private) but I don't think it's likely to go away anytime soon.
You: After you make the gem, refactor this PR to use the gem instead of streaming and processing the logs.
Also FWIW I added a giant warning to the README. In general I don't think people should be using this gem for any length of time.
Thanks for the review, and for the info. It will probably be awhile before I can get to these changes, but I'll submit when I can.
If you can't get to it, that's fine too. I just didn't want to leave you hanging. Thanks for the PR, this was a lot of code. Sorry for no comment for so long.
This PR uses code from the whacamole gem, modified to fit PWK. If you provide a Heroku API token in your config, then instead of attempting to measure total memory use directly PWK will get it from your Heroku logs with an HTTP request. This PR doesn't change existing operation until a Heroku API token is added to
PumaWorkerKiller.config
.If the HTTP request does not return any memory info, the
get_total
method will return 0.0 instead of raising an exception, so nothing blows up. Generally, the next request will return valid memory information.Note that this PR requires log-runtime-metrics and the
DYNO
environment variable, both of which are Heroku beta features. Again, though, nothing blows up if they stop working. Worst-case scenario is if Heroku removes these features and your app still has a Heroku API token inPumaWorkerKiller.config
, the app will continue to run but PWK will not work.There are no new tests, because I don't know how to test these features. Existing tests pass locally.