Open asalant opened 10 years ago
@asalant @bobzoller I am interested to log JSON data from Heroku to Splunk but can’t get Splunk to recognize it as JSON since Heroku includes a prefix of key value pairs with timestamp, etc., that is not valid JSON. How do you work around that and have Splunk recognize JSON from Heroku and visually display it as follows?
Issue described here: http://forum.loggly.com/discussion/391/logging-json-from-heroku
I just made our heroku-log-normalizer a public repo. There are no docs there, so I'll explain a bit here:
We had an issue for a long time where log lines would get split across multiple syslog messages coming out of Heroku's Logplex. This is most likely just a known limitation of syslog and not their fault, however it really complicates things when you're logging JSON. HTTP logdrains do not seem to be affected by this same issue... arbitrarily long log lines are transmitted as one, as expected.
Since Logplex wraps everything with syslog, you cannot simply ship your logs to Sumo (or another solution) and get native JSON parsing. Instead, heroku-log-normalizer transforms the syslog metadata into JSON, merges it with your application's JSON log line, and ships off raw JSON to Sumo.
The next problem is Heroku platform logs, or really, any log line that isn't JSON. heroku-log-normalizer tries to JSON parse loglines, but if it fails, the entire line is sent as the msg
property of a JSON object instead (along with the syslog metadata).
We include all sorts of identifying metadata in our application JSON logs -- like APP_NAME
and APP_INSTANCE
for example -- but when a Heroku platform log line comes through we get none of it. The syslog metadata provides some (heroku app name, etc), but we need more to properly attribute the line to an app and environment. To solve this, heroku-log-normalizer will merge in any HTTP query parameters passed to the drain endpoint.
heroku drains:add https://heroku-log-normalizer.yourdomain.com/drain?name=myapp&appInstance=production
a platform log will arrive looking like:
{"timestamp":"2015-09-24T16:34:07.699Z","msg":"at=info method=GET path=\"/\" host=corp-myapp.herokuapp.com request_id=deaa77fa-c9c8-4eed-a8d2-867ee4e85317 fwd=\"107.127.127.127,199.127.127.127\" dyno=web.2 connect=2ms service=39ms status=200 bytes=745586\n","syslog":{"prival":158,"facilityID":19,"severityID":6,"facility":"local3","severity":"info","type":"RFC5424","time":"2015-09-24T16:34:06.746Z","host":"host","appName":"heroku","pid":"router","msgID":null},"name":"myapp","appInstance":"production"}
and your application log will arrive looking like:
{"timestamp":"2015-09-24T16:34:06.752Z","name":"myapp","appInstance":"production","env":"production","pid":3,"hostname":"d14bc717-0e0c-4ad0-b07c-5e8a60a3bd06","custom":{"whatever":"you know"},"requestId":"deaa77fa-c9c8-4eed-a8d2-867ee4e85317","level":30,"res":{"status":200,"contentLength":745068},"responseTime":35,"msg":"GET / HTTP/1.1","v":0,"syslog":{"prival":190,"facilityID":23,"severityID":6,"facility":"local7","severity":"info","type":"RFC5424","time":"2015-09-24T16:34:06.758Z","host":"host","appName":"app","pid":"web.2","msgID":null}}
Hope that's helpful!
@bobzoller Wow, this is great! Thanks for explaining and sharing your code :smile:
/cc @aseemk @ryanwe @chanadian @davestern @matthew-cox
@bobzoller: this is incredible! Both the code and writeup. Thank you!
I was surprised to hear that the log line splitting isn't an issue for HTTP drains. That's awesome if so, but just want to double-check: are you sure your change didn't coincide with Heroku raising that limit from 1k bytes to 10k bytes?
https://devcenter.heroku.com/changelog-items/448
Thanks again very much. We're excited by this; just need to find some time now to try it out. =)
I'm going to go ahead file the first feature request now =): parsing logfmt-style logs too, e.g. from Heroku's router. We rely on that data heavily, and it'll be good not to lose that structure.
P.S. Thanks @gasi for discovering, asking, and mentioning!
JSON logging is cool, Splunk, bunyan and other tools make it cooler. But Heroku splits long log lines producing invalid JSON log lines. It would be great if Heroku fixed/changed this but in the meantime @bobzoller built a service that reassembles Heroku JSON log lines so that you can enjoy JSON logs from your Heroku apps.