CtopCsUtahEdu / chill


Building in Release-mode creates unusable executable #5

Open bugwelle opened 3 years ago

bugwelle commented 3 years ago

Hi there,

first of all: Great project! :+1:

I looked a bit at your project and wanted to create a Release executable using CMake:

cd chill
# ROSEHOME, etc. are already set
cmake -S . -B build -DCMAKE_BUILD_TYPE=Release
cmake --build build --config=Release -j 12

However, when I try to run a simple program (see below) it seems to run into an endless loop (at least that's what I assume).

My guess is that the compiler optimizes some undefined behavior in a way that sends chill into an endless loop. One CPU core sat at 100%, and after 5 minutes the program still hadn't finished, so I aborted it.

It took me a while to think of deleting my build folder and compiling CHiLL in Debug mode. To my surprise, I instantly got the correct result.

While compiling, I get a lot of compiler warnings. I haven't looked into them, but I assume that there are a few that would need fixing. I only wanted to create this issue in case others run into the same bug. Furthermore, CHiLL works just fine when compiled in Debug mode. :-)

loop.script.py

from chill import *

source('./loop.cpp')
destination('loop_modified.cpp')
procedure('foo')
loop(0)

original()
print_dep()
print_code()

skew([0], 2, [1,1])
print_dep()
print_code()

permute([2,1,3])
print_dep()
print_code()

loop.cpp

void foo(double** A) {
    int t, z, y, x;
    for(t = 0; t < 10; t++) {
        for (z = 2; z < 30; z++) {
            for (y = 0; y < 20; y++) {
                for (x = 1; x < 10; x++) {
                    A[t+1][(z*20*10)+(y*10)+(x)] = A[t][(z*20*10)+(y*10)+(x-1)] + A[t][(z*20*10)+(y*10)+(x)] + A[t][(z*20*10)+(y*10)+(x+1)];
                }
            }
        }
    }
}

Output in Release mode

dependence graph:
0->0: A:flow(1, 0, 0, 1) A:flow(1, 0, 0, 0) A:flow(1, 0, 0, -1)
# I aborted here
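To make the hang reproducible without watching the terminal, the run can be wrapped in a timeout. The following is a minimal sketch, not part of CHiLL: `run_with_timeout` is a hypothetical helper, and the binary path and command line in the closing comment are assumptions about the local build layout.

```python
import subprocess
import sys


def run_with_timeout(cmd, timeout_s):
    """Run cmd and report whether it finished or had to be killed.

    Returns ("finished", returncode, stdout) or ("timeout", None, partial_stdout).
    """
    try:
        proc = subprocess.run(cmd, capture_output=True, text=True, timeout=timeout_s)
        return ("finished", proc.returncode, proc.stdout)
    except subprocess.TimeoutExpired as exc:
        # Partial output may be None, or bytes even when text=True was requested.
        out = exc.stdout or ""
        if isinstance(out, bytes):
            out = out.decode(errors="replace")
        return ("timeout", None, out)


# Self-check with a command that is known to terminate.
status, returncode, output = run_with_timeout([sys.executable, "-c", "print('ok')"], 30)
print(status, returncode, output.strip())

# Assumed invocation for this issue (path and arguments are hypothetical):
#   run_with_timeout(["./build/chill", "loop.script.py"], 60)
```

With the Release build, such a call would be expected to report `"timeout"` together with whatever was printed before the hang. With the Debug build it should report `"finished"`.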

Output in Debug mode

dependence graph:
0->0: A:flow(1, 0, 0, 1) A:flow(1, 0, 0, 0) A:flow(1, 0, 0, -1)
for(t2 = 0; t2 <= 9; t2++) {
  for(t4 = 2; t4 <= 29; t4++) {
    for(t6 = 0; t6 <= 19; t6++) {
      for(t8 = 1; t8 <= 9; t8++) {
        s0(t2,t4,t6,t8);
      }
    }
  }
}

dependence graph:
0->0: A:flow(1, 1, 0, 1) A:flow(1, 1, 0, 0) A:flow(1, 1, 0, -1)
for(t2 = 0; t2 <= 9; t2++) {
  for(t4 = t2+2; t4 <= t2+29; t4++) {
    for(t6 = 0; t6 <= 19; t6++) {
      for(t8 = 1; t8 <= 9; t8++) {
        s0(t2,-t2+t4,t6,t8);
      }
    }
  }
}

dependence graph:
0->0: A:flow(1, 1, 0, 1) A:flow(1, 1, 0, 0) A:flow(1, 1, 0, -1)
for(t2 = 2; t2 <= 38; t2++) {
  for(t4 = max(0,t2-29); t4 <= min(9,t2-2); t4++) {
    for(t6 = 0; t6 <= 19; t6++) {
      for(t8 = 1; t8 <= 9; t8++) {
        s0(t4,t2-t4,t6,t8);
      }
    }
  }
}
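As a sanity check that the Debug output is the correct one, the loop bounds above can be replayed in plain Python. This sketch only looks at the two outer loops (`t` and `z`), since the inner two are unchanged. The helper names are made up for illustration, and the reading of `skew` as "new level 2 = t + z" and of `permute` as "swap the two outer levels" is inferred from the printed code rather than taken from CHiLL's documentation.

```python
def original_points():
    # (t, z) pairs visited by the two outer loops of foo() in loop.cpp.
    return {(t, z) for t in range(0, 10) for z in range(2, 30)}


def skewed_points():
    # Second nest in the Debug output: statement is s0(t2, -t2+t4, ...).
    return [(t2, -t2 + t4)
            for t2 in range(0, 10)
            for t4 in range(t2 + 2, t2 + 29 + 1)]


def permuted_points():
    # Third nest in the Debug output: statement is s0(t4, t2-t4, ...).
    return [(t4, t2 - t4)
            for t2 in range(2, 38 + 1)
            for t4 in range(max(0, t2 - 29), min(9, t2 - 2) + 1)]


def skew_distance(d):
    # Assumed effect of the skew on a distance vector: level 2 becomes t + z.
    dt, dz, dy, dx = d
    return (dt, dt + dz, dy, dx)


def swap_outer(d):
    # Assumed effect of the permutation: exchange the two outer levels.
    return (d[1], d[0], d[2], d[3])


def lex_positive(d):
    # A dependence is preserved if its first non-zero entry is positive.
    for component in d:
        if component != 0:
            return component > 0
    return False


flow = [(1, 0, 0, 1), (1, 0, 0, 0), (1, 0, 0, -1)]
skewed = [skew_distance(d) for d in flow]
permuted = [swap_outer(d) for d in skewed]

print(len(original_points()), len(skewed_points()), len(permuted_points()))
print(skewed)
print(all(lex_positive(d) for d in permuted))
```

Both transformed nests visit each of the 280 original `(t, z)` pairs exactly once, and the skewed distance vectors match the `A:flow(1, 1, 0, ...)` entries printed in the second and third dependence graphs.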

Setup

GCC: 11.1

$ /usr/bin/g++ --version
g++ (Ubuntu 11.1.0-1ubuntu1~20.04) 11.1.0
Copyright (C) 2021 Free Software Foundation, Inc.
This is free software; see the source for copying conditions.  There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.

ROSE: 0.11.46

$ apt info rose
Package: rose
Version: 0.11.46.0.1-0

IEGenLib: latest master branch from GitHub