rlavaee / codestitcher

Interprocedural Basic Block Code Layout Optimization
http://cs.rochester.edu/u/rlavaee/

Getting no optimization for a simple program #6

Closed mahmoodn closed 4 years ago

mahmoodn commented 4 years ago

I have written a sample program like the one below:

#include <stdio.h>

int main(){
   int a = 1;
   for (unsigned int i = 0; i < 10000000; i++) {
       for (unsigned int j = 0; j < 10000; j++) {
           if (j % 100 == 0)
              a += 4;
           else
              a -=1;
       }
       if (i % 400 == 0) {
          for (int k = 0; k < 10; k++) {
              a += 1;
          }
          a -= 2;
       }
   }
   printf("a=%d\n", a);
   return 0;
}

The output of make optimize shows no difference between the ORIGINAL and OPTIMIZED layouts. Is this a limitation of codestitcher, or have I missed something? The code looks normal to me.

mahmood@hpcc-ubuntu:~/codestitcher/test$ make optimize
../build/llvm/bin/clang  -B../build/binutils/bin/ -flto -O3 -Wl,-plugin-opt,-emit-bb-symbols test.o test_util.a -o test
../build/perf/perf record -e instructions:u --branch-filter any,u -o perf.data -- ./test
a=-510519487
[ perf record: Woken up 546 times to write data ]
[kernel.kallsyms] with build id 3a2171019937a2070663f3b6419330223bd64e96 not found, continuing without symbols
[ perf record: Captured and wrote 136.303 MB perf.data (330834 samples) ]
ruby ../scripts/gen_layout.rb -r test -p perf.data -L CS -D 4096
PERF FILES ARE: ["perf.data"]
reading profile: /home/mahmood/codestitcher/test/perf.data
[kernel.kallsyms] with build id 3a2171019937a2070663f3b6419330223bd64e96 not found, continuing without symbols
reading exe syms for: /home/mahmood/codestitcher/test/test
reading exe syms completed
reading profile completed
TOTAL WEIGHT: 9925220
(DYNAMIC) DISTANCE LIMIT: 4096
without affinity, chains: 1
(STATIC INTRA_FUNC) DISTANCE LIMIT: 4096
without affinity, chains: 1
RETURNS:false
ORIGINAL
CALLS: 16 => 9925220(66.7%) , 64 => 0(0.0%) , 4096 => 4962755(33.3%) , 262144 => 0(0.0%) , 2097152 => 0(0.0%) ,  => 0(0.0%)
OPTIMIZED
CALLS: 16 => 9925220(66.7%) , 64 => 0(0.0%) , 4096 => 4962755(33.3%) , 262144 => 0(0.0%) , 2097152 => 0(0.0%) ,  => 0(0.0%)
../build/llvm/bin/clang  -B../build/binutils/bin/ -flto -O3 -Wl,-plugin-opt,-bb-layout=cs -Wl,-plugin-opt,-emit-bb-symbols test.o test_util.a -o test

rlavaee commented 4 years ago

This could imply that the original layout is already good. You’d be in a better position to judge if you disassemble both binaries with objdump and compare them.


mahmoodn commented 4 years ago

Thanks. As I work further with the toolkit, it seems that it only works with C/C++ code, because it relies on the clang command. Am I right? Also, a JIT compiler has no equivalent of -O3; it has its own options. The same holds for binutils. So I wonder whether I can test and profile Java code that is equivalent to your test program. Do you have any hints for that?

rlavaee commented 4 years ago

That’s right. This only works with C and C++ code.
