Quuxplusone / LLVMBugzillaTest

-Oz is not respected by LSR (-enable-iv-rewrite revealed fluctuations in code size) #11516

Open Quuxplusone opened 12 years ago

Quuxplusone commented 12 years ago
Bugzilla Link PR12342
Status NEW
Importance P enhancement
Reported by Jörg Sonnenberger (joerg@NetBSD.org)
Reported on 2012-03-24 00:32:01 -0700
Last modified on 2012-03-27 17:30:27 -0700
Version trunk
Hardware PC Linux
CC anton@korobeynikov.info, atrick@apple.com, efriedma@quicinc.com, llvm-bugs@lists.llvm.org
Fixed by commit(s)
Attachments enable-iv-rewrite.tgz (298267 bytes, application/x-compressed-tar)
Blocks
Blocked by
See also
Created attachment 8262
Files that compile to different sizes and list

Attached is the list of all files from NetBSD's libkern, libsa and libi386 that
differ in code size between -enable-iv-rewrite=false and
-enable-iv-rewrite=true. Some of them are not used in this context. false.txt and
true.txt list the object sizes as I see them during the release build. Running
compile.sh */*.i compiles the files with both options. Big regressions show up
in, e.g., qsort.i; improvements are seen for subr_prf.i.
Quuxplusone commented 12 years ago
I'm having trouble knowing what to look at here. When I run compile.sh and
simply sum the .o sizes, I get a *larger* size for enable-iv-rewrite=true. I'm
using svn r153254.

false: 54760 bytes
 true: 55508 bytes

* .o sizes
  2596 cd9660.i.false
  2676 cd9660.i.true
   324 dkcksum.i.false
   324 dkcksum.i.true
  5032 dosfs.i.false
  5052 dosfs.i.true
  1948 exec.i.false
  1968 exec.i.true
  4268 ext2fs.i.false
  4264 ext2fs.i.true
  3512 ffsv1.i.false
  3504 ffsv1.i.true
  4156 ffsv2.i.false
  4148 ffsv2.i.true
   592 gets.i.false
   624 gets.i.true
   480 ip_cksum.i.false
   484 ip_cksum.i.true
  3560 lfsv1.i.false
  3552 lfsv1.i.true
  3592 lfsv2.i.false
  3584 lfsv2.i.true
  5200 loadfile_elf32.i.false
  4684 loadfile_elf32.i.true
  4888 loadfile_elf64.i.false
  5488 loadfile_elf64.i.true
  3596 minixfs3.i.false
  3588 minixfs3.i.true
  2248 netif.i.false
  2248 netif.i.true
  2128 qsort.i.false
  2780 qsort.i.true
  1336 strerror.i.false
  1348 strerror.i.true
  2024 subr_prf.i.false
  1920 subr_prf.i.true
  3280 ufs.i.false
  3272 ufs.i.true
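
For anyone who wants to reproduce the totals, here is a minimal sketch of the summing step. It assumes a listing of `<size> <file>.<variant>` lines like the one above; the function name `summarize` and its return shape are made up for illustration and are not part of compile.sh.

```python
from collections import defaultdict


def summarize(listing: str):
    """Sum object sizes per variant and compute per-file deltas (true - false).

    Hypothetical helper: expects lines of the form "<size> <file>.<variant>",
    where <variant> is "false" or "true". Other lines are ignored.
    """
    totals = defaultdict(int)   # variant -> total bytes
    sizes = defaultdict(dict)   # file -> {variant: bytes}

    for line in listing.splitlines():
        parts = line.split()
        # Skip headers, blank lines, and anything that is not "<int> <name>".
        if len(parts) != 2 or not parts[0].isdigit():
            continue
        size, name = int(parts[0]), parts[1]
        base, _, variant = name.rpartition(".")
        if variant not in ("false", "true"):
            continue
        totals[variant] += size
        sizes[base][variant] = size

    # Positive delta: the file grew with -enable-iv-rewrite=true.
    deltas = {
        base: v["true"] - v["false"]
        for base, v in sizes.items()
        if "true" in v and "false" in v
    }
    return dict(totals), deltas


if __name__ == "__main__":
    sample = """\
* .o sizes
  2128 qsort.i.false
  2780 qsort.i.true
  2024 subr_prf.i.false
  1920 subr_prf.i.true
"""
    totals, deltas = summarize(sample)
    print(totals)
    # Largest regressions first.
    for name, delta in sorted(deltas.items(), key=lambda kv: -kv[1]):
        print(f"{delta:+6d} {name}")
```

Sorting by delta makes the outliers easy to spot, which is more useful here than the grand total, since gains and losses in individual files partly cancel out.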
Quuxplusone commented 12 years ago

Hm. I thought qsort.i had a large regression, but I might have messed up the list at that point. subr_prf.i is what triggered the investigation: it ended up with 120 bytes of overhead.

Quuxplusone commented 12 years ago

There is a lot of luck involved in getting the "right" codegen for a particular loop nest. Performance results are just as jittery as the size results shown here. The good news is:

I can leave this open because there is an opportunity to make the LSR pass aware of the -Oz option, and this is potentially a good example to motivate it. That said, it's very low on my list of enhancements to add.

Quuxplusone commented 12 years ago

Good enough. Working around the related PR 12348 sidesteps the issue completely for me, so keeping the example for analysis is fine.