flamegraph-rs / flamegraph

Easy flamegraphs for Rust projects and everything else, without Perl or pipes <3
Apache License 2.0
4.61k stars 141 forks source link

feat: add --iterations flags to specify how many times to run target binary #332

Open dvdplm opened 1 month ago

dvdplm commented 1 month ago

As discussed here, PR #255 seems to have stalled a bit.

This PR takes @bartlomieju's PR (#255) and addresses the review concerns. See the PR description in #255 for a detailed description of what the changes mean.

djc commented 3 weeks ago

I'm honestly not sure this is a good fit? The current implementation doesn't seem like it would work for non-perf backends. Maybe it would be better to run this outside flamegraph and then use flamegraph to process the recorded perf data?

dvdplm commented 3 weeks ago

I’m honestly not sure what the problem you see here is but I think being able to tweak the number of iterations that end up in the graph is generally useful.

Are there other backends available today? Beyond the specific implementation here, what is it about —iterations that port over poorly to such backends?

djc commented 3 weeks ago

I can see how it is generally useful, but it's not obviously a good conceptual fit for flamegraph, whose entire paradigm is currently about (a) executing a workload (using any of 3 backends, perf, dtrace or blondie), and then (b) processing the profile output from the backend. This seems to add a substantial intermediate conceptual step in, and you haven't even attempted to make it work with the other backends (and it's not clear that that's possible).

dvdplm commented 3 weeks ago

I see, well, I’m happy to have a stab at looking into what would be required for the other backends so we can see how well/poorly it fits.

dvdplm commented 3 weeks ago

Ok, so I had another look at the code, trying to better understand how the different backends are implemented. As far as I can see they are not cleanly separated or isolated by way of an API. Rather, my impression is that arch::initial_command deals with picking the proper command by plain old if-else-based heuristics and switching on target_os. Nothing wrong with that mind you and the code is small enough for it to be perfectly fine. The "contract" between initial_command and callers is an Option<String> for perf on linux and the file cargo-flamegraph.stacks for dtrace/blondie on other platforms.

On macos and dtrace the cargo-flamegraph.stacks file is appended to when running several times.

I am not sure how the windows and dtrace behaves but would assume it works the same as on macos: appends data.

On windows and blondie the file is created anew with std::fs::File::create so it will truncate it if it exists, so that will not work with my --iterations code, but that should be an easy fix.

On linux and perf, the trace data is passed around in an Option<String> and I am not sure how well the output of several appended runs is handled by perf-script. I'd need a linux host to investigate this further.

Overall this PR, while incomplete as outlined above, takes an approach that is consistent with the rest of the code base: do the simplest thing that works. Sure, there are warts and a few code smells here and there, but fixing that is certainly not what this PR tries to do.

djc commented 3 days ago

Alright, keeping it simple is reasonable though I'd like a few more comments about why/where the implemented approach works, and it would be good to make sure stuff on Windows doesn't get broken (or simply doesn't implement this option?).

(Sorry for the slow feedback.)