Quuxplusone / LLVMBugzillaTest

-O3 -fplugin-arg-dragonegg-enable-gcc-optzns miscompiles rnflow.f90 #8474

Closed Quuxplusone closed 13 years ago

Quuxplusone commented 13 years ago
Bugzilla Link PR10033
Status RESOLVED FIXED
Importance P normal
Reported by Jack Howarth (howarth.mailing.lists@gmail.com)
Reported on 2011-05-27 07:48:26 -0700
Last modified on 2011-05-31 11:08:28 -0700
Version trunk
Hardware Macintosh MacOS X
CC llvm-bugs@lists.llvm.org
At r132184 the rnflow test case compiles with
-O3 -fplugin-arg-dragonegg-enable-gcc-optzns, but it appears to be miscompiled:
it runs far more slowly than when compiled with
-O3 -fplugin-arg-dragonegg-enable-gcc-optzns -fno-tree-vectorize...

[MacPro:pb05/lin/source] howarth% /sw/lib/gcc4.5/bin/gfortran -fplugin=/sw/lib/gcc4.5/lib/dragonegg.so -O3 -fplugin-arg-dragonegg-enable-gcc-optzns rnflow.f90 -fno-tree-vectorize -o rnflow
[MacPro:pb05/lin/source] howarth% ./rnflow
0: 0: 0.000 -> Read sequence
  0: 0: 1.146 -> extract extrema
  0: 0: 1.151 -> Generate raw transitions counts
  0: 0: 1.157 -> Compute Markov matrix
  0: 0: 1.158 -> Calculate theoretical rainflow
  0: 0:15.769 -> Simulate random markov sequences
  0: 0:15.959 -> Completed simulation #    1
  0: 0:16.149 -> Completed simulation #    2

[MacPro:pb05/lin/source] howarth% /sw/lib/gcc4.5/bin/gfortran -fplugin=/sw/lib/gcc4.5/lib/dragonegg.so -O3 -fplugin-arg-dragonegg-enable-gcc-optzns rnflow.f90 -o rnflow
[MacPro:pb05/lin/source] howarth% ./rnflow
0: 0: 0.001 -> Read sequence
  0: 0: 1.151 -> extract extrema
  0: 0: 1.157 -> Generate raw transitions counts
  0: 0: 1.163 -> Compute Markov matrix
  0: 0: 1.163 -> Calculate theoretical rainflow
  0: 2:47.038 -> Simulate random markov sequences

where the time required for "Calculate theoretical rainflow" has increased
roughly 11-fold and the completed simulations never appear. Since
-fno-tree-vectorize eliminates the problem, it presumably lies in the
vectorization optimizations.
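
As a quick check of the 11-fold figure, the stamps in the two logs above can be converted to seconds and subtracted. This is only a sketch: it assumes each stamp has the form hours:minutes:seconds and marks the moment the named step begins, so the duration of a step is the difference to the next stamp. The helper name to_seconds is made up for this illustration.

```python
def to_seconds(stamp: str) -> float:
    """Convert a stamp such as '0: 2:47.038' (assumed h:m:s) to seconds."""
    hours, minutes, seconds = (part.strip() for part in stamp.split(":"))
    return int(hours) * 3600 + int(minutes) * 60 + float(seconds)

# Stamps copied from the logs: start of "Calculate theoretical rainflow"
# and start of the step that follows it.
no_vectorize = to_seconds("0: 0:15.769") - to_seconds("0: 0: 1.158")
vectorize = to_seconds("0: 2:47.038") - to_seconds("0: 0: 1.163")
ratio = vectorize / no_vectorize

print(f"{no_vectorize:.3f} s without vectorization, {vectorize:.3f} s with it")
print(f"slowdown factor: {ratio:.1f}")
```

Under those assumptions the step takes about 14.6 s with -fno-tree-vectorize and about 165.9 s without it, which is consistent with the reported slowdown.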
Quuxplusone commented 13 years ago

Looks like I implemented VEC_LSHIFT_EXPR and VEC_RSHIFT_EXPR wrong.
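
For readers unfamiliar with these tree codes: in GCC they denote whole-vector shifts, which the tree vectorizer uses, among other places, when folding a vector of partial results into a scalar at the end of a reduction loop. The sketch below is a toy model, not DragonEgg's code. It counts the shift in elements rather than bits, ignores target endianness, and uses made-up function names. The thread does not say what exactly was wrong in the original lowering; the model only illustrates how a whole-vector shift in the wrong direction silently corrupts such a reduction instead of crashing.

```python
def shift_toward_index0(vec, n):
    """Move every element n slots toward index 0; vacated slots become 0."""
    return vec[n:] + [0] * n

def shift_away_from_index0(vec, n):
    """Move every element n slots away from index 0; vacated slots become 0."""
    return [0] * n + vec[:len(vec) - n]

def reduce_sum(vec, shift):
    """Log-step horizontal sum: shift by half the width, add lane-wise, repeat.

    The result is read from lane 0, so it is only correct if `shift`
    moves elements toward index 0. The length is assumed to be a power of two.
    """
    width = len(vec)
    while width > 1:
        width //= 2
        vec = [a + b for a, b in zip(vec, shift(vec, width))]
    return vec[0]

partial_sums = [1, 2, 3, 4]
print(reduce_sum(partial_sums, shift_toward_index0))     # 10, the true sum
print(reduce_sum(partial_sums, shift_away_from_index0))  # 1, a wrong answer
```

A result that is wrong but still a valid number fits the symptom reported above: the program keeps running but behaves differently, rather than failing outright.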

Quuxplusone commented 13 years ago
Fixed here:
http://lists.cs.uiuc.edu/pipermail/llvm-commits/Week-of-Mon-20110523/121543.html
The resulting code is pretty rotten. In fact, when compiled without
-fplugin-arg-dragonegg-enable-gcc-optzns I see rnflow running faster than
when compiled with gcc-4.5; with -fplugin-arg-dragonegg-enable-gcc-optzns it
runs slower than with gcc-4.5.
Quuxplusone commented 13 years ago
This isn't what I observe on x86_64-apple-darwin10.

[MacPro:pb05/lin/source] howarth% /sw/lib/gcc4.5/bin/gfortran -fplugin=/sw/lib/gcc4.5/lib/dragonegg.so -O3 rnflow.f90 -o rnflow
[MacPro:pb05/lin/source] howarth% ./rnflow
...
  0: 0:31.785 -> Completed program execution

[MacPro:pb05/lin/source] howarth% /sw/lib/gcc4.5/bin/gfortran -fplugin=/sw/lib/gcc4.5/lib/dragonegg.so -O3 -fplugin-arg-dragonegg-enable-gcc-optzns rnflow.f90 -o rnflow
[MacPro:pb05/lin/source] howarth% ./rnflow
...
  0: 0:28.217 -> Completed program execution

[MacPro:pb05/lin/source] howarth% /sw/lib/gcc4.5/bin/gfortran -fplugin=/sw/lib/gcc4.5/lib/dragonegg.so -O3 -fplugin-arg-dragonegg-enable-gcc-optzns rnflow.f90 -fno-tree-vectorize -o rnflow
[MacPro:pb05/lin/source] howarth% ./rnflow
...
  0: 0:28.135 -> Completed program execution

These differences need pbharness runs before they can be judged significant.
It would also be interesting to try turning the LLVM optimizers down while
leaving the GCC front end at -O3.
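
Short of a full pbharness run, a rough idea of the run-to-run noise can be had by timing each binary several times and comparing means against standard deviations. The following is a minimal sketch and does not reproduce what pbharness does; the function name time_runs and the default of five runs are arbitrary choices for illustration.

```python
import statistics
import subprocess
import sys
import time

def time_runs(cmd, runs=5):
    """Run cmd `runs` times (at least 2) and return (mean, stdev) of wall time in seconds."""
    samples = []
    for _ in range(runs):
        start = time.perf_counter()
        # Discard the program's own output; only the elapsed time matters here.
        subprocess.run(cmd, check=True, stdout=subprocess.DEVNULL)
        samples.append(time.perf_counter() - start)
    return statistics.mean(samples), statistics.stdev(samples)

if __name__ == "__main__":
    # Stand-in command so the sketch runs anywhere; for this report one would
    # pass something like ["./rnflow"] for each differently compiled binary.
    mean, stdev = time_runs([sys.executable, "-c", "pass"], runs=3)
    print(f"mean {mean:.3f} s, stdev {stdev:.3f} s")
```

If the gap between two builds is not clearly larger than the spread within each build, a difference such as 28.217 s versus 28.135 s should not be read as meaningful.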
Quuxplusone commented 13 years ago

I've reimplemented VEC_RSHIFT_EXPR in a different way that should result in better code.