lukego / live

Luke's Snabb Solutions - Live Coding Session Archive

Live #4: Source code highlighting JIT info (2) #4

Open lukego opened 6 years ago

lukego commented 6 years ago

Luke's Snabb Solutions - Live Coding

U1F984 commented 6 years ago

I'm not sure how simple/difficult this would be but I think what could be very helpful as well would be a view where the time spent within a particular trace would be shown more clearly; I imagined something like this, just for traces:

Profiler screenshot, source: msdn

Within small programs that might not be useful, but for long-running programs with potentially hundreds of traces it should give a good overview.

Also is there any way to look at the generated assembler code? Might not be that useful by itself, but with something more intricate we could check which assembly instructions were "hot", getting some information about not only which code was successfully compiled but also how efficient it is. That would probably require more invasive profiling/higher frequency polling though to get meaningful results.

lukego commented 6 years ago

@Quicksteve The Studio profiler does support a breakdown of profiler data in three steps:

  1. Select the relevant profile (subsystem of an application.)
  2. Select a source location where a tree of traces starts.
  3. Select a root- or side- trace within the hierarchy and examine its code in detail.
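The three steps above can be pictured as a drill-down over nested sample counts. The following is only a rough sketch: the record layout `(profile, start_location, trace_id, samples)`, the sample data, and the function names are hypothetical and are not taken from Studio.

```python
from collections import defaultdict

# Hypothetical sample records: (profile, start location, trace id, sample count).
SAMPLES = [
    ("bench-a", "loop.lua:10", 1, 120),
    ("bench-a", "loop.lua:10", 2, 30),
    ("bench-a", "util.lua:42", 3, 50),
    ("bench-b", "main.lua:7", 4, 200),
]

def build_breakdown(samples):
    """Nest sample counts as profile -> start location -> trace id."""
    tree = defaultdict(lambda: defaultdict(dict))
    for profile, location, trace_id, count in samples:
        by_trace = tree[profile][location]
        by_trace[trace_id] = by_trace.get(trace_id, 0) + count
    return tree

def drill_down(tree, profile, location):
    """Steps 1 and 2: pick a profile and a start location.

    Returns the (trace id, samples) pairs under that location, hottest
    first, so that step 3 can pick one trace to examine in detail.
    """
    traces = tree[profile][location]
    return sorted(traces.items(), key=lambda item: item[1], reverse=True)

breakdown = build_breakdown(SAMPLES)
for trace_id, count in drill_down(breakdown, "bench-a", "loop.lua:10"):
    print(f"trace {trace_id}: {count} samples")
```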

Here is a screenshot from a virtual machine that has run several different benchmarks and profiled them separately:

Studio screenshot

Is that similar to what you have in mind?

Also is there any way to look at the generated assembler code?

Currently it is only possible to examine the intermediate representation code. This can be viewed either textually or graphically (SSA directed graph.) I have found the IR code the most relevant for performance work.

I do want to support machine code. I have taken the preliminary step of making the VM record a 1:1 mapping between intermediate instructions and machine instructions. However the bit that's missing is to disassemble the machine code and display it. I was thinking of doing this in a complicated way using Intel XED (https://github.com/intelxed/xed/issues/68) but probably the first step should be a simple objdump.
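As a rough illustration of the "simple objdump" first step, the sketch below assumes that the machine code of one trace has been written to a raw binary file (the file name is hypothetical), that the target is x86-64, and that GNU objdump is on the PATH. It is not how the VM or Studio does it.

```python
import subprocess

def objdump_command(path, intel_syntax=True):
    """Build a GNU objdump command line for a headerless blob of x86-64 code."""
    cmd = ["objdump", "-D", "-b", "binary", "-m", "i386:x86-64"]
    if intel_syntax:
        cmd += ["-M", "intel"]  # optional: Intel syntax instead of AT&T
    cmd.append(path)
    return cmd

def disassemble(path):
    """Run objdump on the file and return its textual disassembly."""
    result = subprocess.run(
        objdump_command(path), capture_output=True, text=True, check=True
    )
    return result.stdout

# Example (assumes a hypothetical file "trace.bin" holding one trace's code):
#     print(disassemble("trace.bin"))
```

The `-b binary` and `-m` options are needed because the dump has no object file header for objdump to read the architecture from.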

lukego commented 6 years ago

@Quicksteve One problem that I want to tackle is mapping traces onto source code. I have the data that I need in the Studio GUI but I haven't worked out how to present it yet. I think the most effective approach will be to treat traces as primary and then cross-reference them to relate them to the source code.

I would love to present the source code directly as the primary interface but I am concerned that this is too far from reality with a tracing JIT. The LuaJIT profiler for example presents its results in terms of lines of code, functions, and call stacks - similar to your screenshot - and I have found it completely misleading in practice and that is why I focus on traces instead.

An open problem is writing clear documentation so that users will understand what a "trace" is!

U1F984 commented 6 years ago

I would love to present the source code directly as the primary interface but I am concerned that this is too far from reality with a tracing JIT. The LuaJIT profiler for example presents its results in terms of lines of code, functions, and call stacks - similar to your screenshot - and I have found it completely misleading in practice and that is why I focus on traces instead.

I agree, a direct mapping from source to trace is probably not possible, because traces can span functions or start inside loops. What could be useful, however, is something along the lines of "here's a list of the hottest traces in the application", which could be an alternative entry point to manually drilling down through the steps you outlined above.
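To make the "hottest traces" idea concrete, here is a minimal sketch that assumes the profiler can give a mapping from trace id to sample count. That input format and the function name are assumptions.

```python
import heapq

def hottest_traces(samples_by_trace, n=3):
    """Return up to n (trace id, samples, share of all samples), hottest first."""
    total = sum(samples_by_trace.values())
    if total == 0:
        return []
    top = heapq.nlargest(n, samples_by_trace.items(), key=lambda item: item[1])
    return [(trace_id, count, count / total) for trace_id, count in top]

# Hypothetical per-trace sample counts.
counts = {1: 120, 2: 30, 3: 50, 4: 200}
for trace_id, count, share in hottest_traces(counts, n=2):
    print(f"trace {trace_id}: {count} samples ({share:.0%})")
```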

I also think overloading the source code with information about traces might not be the way to go. Going from the trace overview to the source code it represents, and possibly annotating that excerpt, is probably more useful. What would be really neat is a source code overview per root trace that somehow visualizes where side traces branch off the main trace, with the ability to open a side trace and view its own side traces. I know the same could be achieved by reading the start line field of the traces, but that might be hard to picture and follow for 10+ (side-)traces in the same function.
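Something like the following could turn the start line field into a hierarchy that is easier to follow. It is only a sketch: the record layout (trace id mapped to parent id and start line) and the sample data are hypothetical.

```python
from collections import defaultdict

# Hypothetical trace records: id -> (parent id, or None for a root trace; start line).
TRACES = {
    1: (None, "loop.lua:10"),
    2: (1, "loop.lua:14"),
    3: (1, "loop.lua:18"),
    4: (2, "util.lua:42"),
}

def render_tree(traces):
    """Return indented lines showing each root trace and its side traces."""
    children = defaultdict(list)
    for trace_id, (parent, _line) in sorted(traces.items()):
        children[parent].append(trace_id)

    lines = []

    def walk(trace_id, depth):
        kind = "root" if depth == 0 else "side"
        start_line = traces[trace_id][1]
        lines.append(f"{'  ' * depth}{kind} trace {trace_id} @ {start_line}")
        for child in children[trace_id]:
            walk(child, depth + 1)

    for root in children[None]:
        walk(root, 0)
    return lines

print("\n".join(render_tree(TRACES)))
```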

Finally, something you mentioned in a recent video: taking advantage of CPU performance counters for cache misses etc. I think this would be really interesting data to see on a per-trace level; not sure if that would be possible though.