Hiers closed this issue 8 months ago
Hm, that is odd. I use the plugin with larger sample sizes than that so that shouldn't be the issue. It works fine even if the perf.data file gets huge (>10GB sometimes), just takes a while for perf to generate the report in that case.
It would be helpful if I could reproduce the issue. What perf command did you use to record and on what version of perf? Does it already happen on a simple test program like this:
#include <iostream>

void foo(int i) {
    if (i % 10000) {
        std::cout << i << '\n';
    }
}

int main() {
    for (int i = 0; i < 1'000'000'000; ++i) {
        foo(i);
    }
}
The other thing I find weird here is that in the second picture, with more samples, perf seems to exit much earlier. Perhaps it crashed or reported an error? Normally you should get a warning if perf exits with a non-zero exit code, but perhaps something is going wrong there. The command that the plugin runs to get the output from perf in callgraph mode is the following:
perf report -g folded,0,caller,srcline,branch,count --no-children --full-source-path --stdio -i perf.data
Does this produce any unusual output?
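To check by hand whether perf is failing quietly, one option is to run the command yourself and look at its exit status and stderr separately from the (potentially huge) report on stdout. The following is only a minimal sketch for a POSIX shell: `run_and_report` is a hypothetical helper name, not something perfanno provides, and it assumes perf signals the failure through a non-zero exit status.

```shell
#!/bin/sh
# Hypothetical helper (not part of perfanno): run a command, discard its
# stdout, keep stderr in a temporary file, and report the exit status.
run_and_report() {
    err_file=$(mktemp)
    "$@" >/dev/null 2>"$err_file"
    status=$?
    if [ "$status" -ne 0 ]; then
        echo "command failed with exit status $status"
        # Show only the first few lines of stderr; perf can be verbose.
        head -n 5 "$err_file"
    else
        echo "command succeeded"
    fi
    rm -f "$err_file"
    return "$status"
}
```

It could be used by prefixing the command above, for example `run_and_report perf report -g folded,0,caller,srcline,branch,count --no-children --full-source-path --stdio -i perf.data`. Discarding stdout keeps the terminal readable when the report is large, while anything perf prints to stderr still shows up.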
Running that perf command errors out with:
Couldn't decompress data
0x132836 [0xffff]: failed to process type: 81 [Operation not permitted]
Error:
failed to process sample
I'm not sure why this happens, but if I run perf with no compression it works just fine, perfanno included. It's annoying to have to keep these massive files on my system, but since the problem is either in perf itself or in how my OS packages perf, I'll close the issue.
Thank you for the insight!
I found what I assume is a bug in perfanno when reading perf.data files that have more than ~14500 samples. When I try to load a perf.data file beyond a certain size, perfanno never finishes loading it. The two images below are (ironically) perf data from perfanno reading a file with 14400 samples and a file with 14900 samples.
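[Image: perf profile of perfanno loading a file with 14400 samples]
[Image: perf profile of perfanno loading a file with 14900 samples]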
As the images suggest, perfanno loaded the smaller file in under a second. With the larger file, perf runs for a much shorter time, but perfanno never shows the annotated source code.