giltene / jHiccup

jHiccup is a non-intrusive instrumentation tool that logs and records platform "hiccups" - including the JVM stalls that often happen when Java applications are executed and/or any OS or hardware platform noise that may cause the running application to not be continuously runnable.
Other
684 stars 92 forks source link

Visualization/log processing tools outdated #33

Closed rrva closed 5 years ago

rrva commented 5 years ago

Not any of the mentioned log processing tools were able to process my logs

jHiccupLogProcessor, failed to parse logs produced with jhiccup 2.0.10:

./jHiccupLogProcessor -i ~/jhiccup.log -o jhiccup_processed.log
Log file parsing error at line number 0: line appears to be malformed.

Using this file (only gzipped to make attachment smaller):

jhiccup.log.gz

I also tried some other tools linked in the README, none of them worked either:

https://github.com/ennerf/HdrHistogramVisualizer

https://hdrhistogram.github.io/HdrHistogramJSDemo/logparser.html

http://hdrhistogram.github.io/HdrHistogram/plotFiles.html

Please provide an up-to-date list of tools that support the log format of 2.0.10 or instructions on how to convert logs to other formats.

rrva commented 5 years ago

https://github.com/HdrHistogram/HdrHistogram/blob/master/HistogramLogProcessor worked fine

giltene commented 5 years ago

The file you link to above really does have a malformed line. The first line in the log (after the headers) has the string "HISTF" (which is the charateristic begininning of a base64 encoded compressed histogram buffer) 6 times in it. The newer log processor in the not yet released HdrHistogram master appears to be more forgiving (skips the line rather than terminate processing), but now I'm thinking that this is not a good thing, and that we should make it complain and terminate too (at least by default. we can add a "forgiving" mode flag).

As to how your log file got corrupted, I can say for sure, but I've seen this happen in situations where multiple jHiccups are running at the same time, and they were all given the same file to log into. By default, this is hard to make happen because jHiccup picks log file names with %pid in them, but if you manually set the log file to e.g. jhiccup.log, this can easily happen if you do so with multiple processes that run in the same environment.

I would suggest adding %pid and possibly %date to your specified name (e..g "-l hiccup.%date.%pid.hlog" is equivalent to the default).

rrva commented 5 years ago

Indeed I specified the logfile name:

/jHiccup-2.0.10/jHiccup -l /dump/jhiccup.log java

This was inside a docker container where I thought just a single process was running, but the app was packaged using https://github.com/puniverse/capsule which seems to fork another process.

I will change it to use %pid or stop using capsule.

When I get my logs uncorrupted, how can I produce hiccup plots like the one in the README named "Hiccups by time interval"? I want to correlate hiccups against reported high latencies by timestamp.

giltene commented 5 years ago

An interesting issue with containers is that %pid is container-local, so multiple containers can overlap on %pid. I just now (https://github.com/giltene/jHiccup/commit/aa184562f7604e8f47248de661e522ef89e6a9f0) added %host to the stuff you can put in the log name to help disambiguate log names on shared volumes in container environments.

As to tools to visualize the logs with, there are multiple tools that will let you visualize HdrHistogram logs. The tool I use the most is HistogramLogAnalyzer. Others are mentioned in the jHiccup README under "Hiccup Charts: Plotting jHiccup results".