etsy / statsd-jvm-profiler

Simple JVM Profiler Using StatsD and Other Metrics Backends
MIT License
330 stars 93 forks source link

Fix flame graph input query by including all steps #13

Closed danosipov closed 9 years ago

ajsquared commented 9 years ago

I like the idea of having a unified view of all the stages of a job, but I think there might be some confusion because the memory and GC metrics are still stage-specific but the flame graph is for the entire job.

What do you think of having a toggle between a stage-specific and a unified view for all of the metrics?

danosipov commented 9 years ago

That makes sense - keep in mind as it is now, the visualization is broken as it queries a metric that doesn't exist in DB. I'll see how a dropdown can be added.

ajsquared commented 9 years ago

Ah interesting, that is working correctly for me. Let me confirm that I didn't forget to push something.

ajsquared commented 9 years ago

Are you using https://github.com/etsy/statsd-jvm-profiler/blob/master/example/StatsDProfilerFlowListener.scala#L26? I realize that I forgot to update that to produce metrics that are formatted in a way supported by this dashboard.

danosipov commented 9 years ago

Yes, modified to write to InfluxDB (graphite was causing too many problems)

ajsquared commented 9 years ago

Yeah, that was the same reason we ended up switching to InfluxDB. I've got an updated version of that FlowListener that will work with this dashboard; I'll push that here.

danosipov commented 9 years ago

:thumbsup: Thanks!!

ajsquared commented 9 years ago

I just pushed the updated FlowListener. Check that out and let me know if that works out better for you with the dashboard.

danosipov commented 9 years ago

Hey @ajsquared - the new FlowListener doesn't write any metrics. The previous listener set the following configuration properties, that the new one doesn't set, is it related?

//          conf.set("mapred.task.profile.params", baseParams)
//          conf.set("mapred.task.profile.maps", getTaskToProfile(numMapTasks,
//            String.format("statsd.profiler.map%s.task", stepNum), conf))
//          conf.set("mapred.task.profile.reduces", getTaskToProfile(numReduceTasks,
//            String.format("statsd.profiler.reduce%s.task", stepNum), conf))
ajsquared commented 9 years ago

This might be a Hadoop version mismatch on our parts. Since the initial FlowListener was written we had upgraded to a newer version of Hadoop where the mapred.task.profile.* properties were deprecated (http://hadoop.apache.org/docs/r2.5.2/hadoop-project-dist/hadoop-common/DeprecatedProperties.html). The new FlowListener uses the new names. The other difference is that support for the mapreduce.task.profile.map.params and mapreduce.task.profile.reduce.params properties was added, which the new FlowListener uses to set the prefix for metrics differently to distinguish the metrics from the map and reduce tasks.

You should be able to change the properties back to the older name without issue, aside from the use of the new map- and reduce-specific properties

ajsquared commented 9 years ago

The profiler now supports InfluxDB 0.9.x and uses tags on the metrics for querying rather than the hierarchical metric names (the example flow listener has been updated to reflect this). Let us know if you're still experiencing problems!