Quuxplusone / LLVMBugzillaTest

0 stars 0 forks source link

ThinLTO doesn't apply all interprocedural GlobalOpt optimizations #32627

Open Quuxplusone opened 7 years ago

Quuxplusone commented 7 years ago
Bugzilla Link PR33657
Status NEW
Importance P enhancement
Reported by Charles Saternos (charles.saternos@gmail.com)
Reported on 2017-06-30 08:33:56 -0700
Last modified on 2019-01-16 12:56:07 -0800
Version trunk
Hardware PC Linux
CC ditaliano@apple.com, llvm-bugs@lists.llvm.org, tejohnson@google.com, xkspr7@gmail.com
Fixed by commit(s)
Attachments
Blocks
Blocked by
See also
LTO will apply the globalopt pass, but ThinLTO can't apply all these
optimizations right now.

$ cat a.cc
#include <stdio.h>
#include <assert.h>
#include <stdlib.h>

char x = 0; // shrunk to bool
char b = 1; // localized
bool c = false; // deleted
int p[] = {1, 2, 3}; // scalarized

void bb() {
    if(rand() % 8) {
        x = 1;
    }
}

int jj() {
    b = (rand() % 2);
    return b == 0;
}
$ cat b.cc
#include <stdio.h>
#include <stdlib.h>
extern char x;
extern bool c;
extern int p[];

extern int jj();
extern void bb();

void cc() {
  if(rand() % 8) {
    x = 1;
  }
}

int main(void) {
  printf("%d\n", p[1]++);
  bb();
  cc();
  if(x) {
    puts("Hi");
  }
  for(int i = 0; i < 200; i++) {
    printf("%d\n", jj());
  }
  if(c)
    puts("yo");
}

$ ../build/bin/clang++ a.cc b.cc -O3 --for-linker=-mllvm --for-linker=-stats -
flto=thin -fuse-ld=lld -o thin &>&1 | grep globalopt
         1 globalopt             - Number of globals deleted
$ ../build/bin/clang++ a.cc b.cc -O3 --for-linker=-mllvm --for-linker=-stats -
flto -fuse-ld=lld -o lto &>&1 | grep globalopt
   1 globalopt             - Number of globals deleted
   2 globalopt             - Number of functions converted to fastcc
   1 globalopt             - Number of globals localized
   1 globalopt             - Number of aggregate globals broken into scalars
   1 globalopt             - Number of global vars shrunk to booleans
   6 globalopt             - Number of globals marked unnamed_addr
Quuxplusone commented 7 years ago

Looks like we'll need to get some index-based interprocedural constant prop to get things lik the deletion of "c".

For "b", we should already be marking for internalization during the thin link, unless we decide to import jj() into the module with main(), which of course means it then has external references. However, there are a couple of possibilities that we have talked about for handling situations like this. One is that when we have a function like jj() that is only referenced by one other module, we may want to mark it for import as a local copy, rather than import as available_externally. Then we can mark it for internalization in its original module, where it can be deleted. That still leaves "b" exported, however. But with the existing ThinLTO index call graph, we can see that "b" is only referenced by its defining module (before the export due to jj() being imported), and if we have read/write info on the reference edges we would see it is only written once where it is defined, and could mark it for import as a local copy as well (marking the original for internalization). Another possibility is handling "b" via interprocedural constant prop, by marking the write reference with the constant value, and propagating this via the index.

Quuxplusone commented 5 years ago
Hi all,

It looks interesting to me. I will be happy to work on this.

Thanks,
Akshat Garg