brimdata / super

A novel data lake based on super-structured data
https://zed.brimdata.io/
BSD 3-Clause "New" or "Revised" License

Stack traces not going to zqd logs #1097

Closed philrz closed 4 years ago

philrz commented 4 years ago

When narrowing down the root cause of #1087, we found that stack traces were being generated. However, the team was surprised to find that the full traces were not landing in the zqd logs, even though they were ending up in the NDJSON response (#1098).

To repro, run Brim at commit 8c496e0 and import the pcap wrccdc.2018-03-23.010014000000000.pcap.gz. After my own repro, here's what was in my run/logs/zqd-core.log:

{"level":"info","ts":1597336096.547731,"msg":"Starting","datadir":"/Users/phil/work/brim/run/data/spaces","pprof_routes":false,"zeek_supported":true}
{"level":"info","ts":1597336096.5538912,"logger":"httpd","msg":"Listening","addr":"127.0.0.1:9867"}
{"level":"dpanic","ts":1597338382.1411011,"msg":"Panic","error":"runtime error: slice bounds out of range [16:1]"}
{"level":"warn","ts":1597338386.383532,"msg":"Error writing response","request_id":"17","error":"malformed zng value"}
{"level":"dpanic","ts":1597338394.963316,"msg":"Panic","error":"runtime error: slice bounds out of range [123:58]"}
{"level":"warn","ts":1597338412.063315,"msg":"Error writing response","request_id":"36","error":"malformed zng value"}
{"level":"info","ts":1597340938.705417,"msg":"Signal received","signal":"interrupt"}
{"level":"info","ts":1597340938.7062452,"logger":"httpd","msg":"Shutting down","reason":"context closed"}
{"level":"info","ts":1597340938.712368,"logger":"httpd","msg":"Closed"}

As you can see, the "headline" of each stack trace is present, but not the details. In these situations we'd like to be able to ask users to send us logs and receive enough information that we stand a chance of fixing the problem, so we should make sure the full stack trace details are logged.
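To illustrate what "full stack trace details" could look like in a log entry, here is a minimal sketch using only the Go standard library. It is not the zqd code and not necessarily how the eventual fix works: the `panicEntry` type, the `capturePanic` helper, and the `stacktrace` field name are all hypothetical, chosen only to mirror the shape of the log lines above.

```go
package main

import (
	"encoding/json"
	"fmt"
	"os"
	"runtime/debug"
	"time"
)

// panicEntry is a hypothetical log record. The first four fields mirror the
// log lines in this issue; "stacktrace" is the detail we would like to add.
type panicEntry struct {
	Level      string  `json:"level"`
	TS         float64 `json:"ts"`
	Msg        string  `json:"msg"`
	Error      string  `json:"error"`
	Stacktrace string  `json:"stacktrace"`
}

// capturePanic (hypothetical helper) runs fn. If fn panics, it recovers and
// returns an entry holding both the panic value and the goroutine's stack.
// If fn returns normally, it returns nil.
func capturePanic(fn func()) (entry *panicEntry) {
	defer func() {
		if r := recover(); r != nil {
			entry = &panicEntry{
				Level: "dpanic",
				TS:    float64(time.Now().UnixNano()) / 1e9,
				Msg:   "Panic",
				Error: fmt.Sprint(r),
				// debug.Stack is called inside the deferred function, so the
				// trace still includes the frames that led to the panic.
				Stacktrace: string(debug.Stack()),
			}
		}
	}()
	fn()
	return nil
}

func main() {
	start := 16
	entry := capturePanic(func() {
		buf := []byte{0}
		_ = buf[start:] // out-of-range slice, triggers a runtime panic
	})
	if entry != nil {
		// Emit one JSON object per line, like the log file shown above.
		if err := json.NewEncoder(os.Stdout).Encode(entry); err != nil {
			fmt.Fprintln(os.Stderr, "encode failed:", err)
		}
	}
}
```

With something along these lines, a user-supplied log would carry the function names and line numbers for the panic rather than just the one-line error.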

alfred-landrum commented 4 years ago

@mattnibs : I believe the code in #1166 covers this issue, do you agree?

philrz commented 4 years ago

@alfred-landrum: Indeed, I confirmed this in the verification I added to #1098, so @mattnibs killed two birds with one stone. Closing this one.