Quuxplusone / LLVMBugzillaTest

Provide easy to inspect vectorization information #19352

Open Quuxplusone opened 10 years ago

Quuxplusone commented 10 years ago
Bugzilla Link PR19353
Status NEW
Importance P enhancement
Reported by Gonzalo BG (gonzalo.gadeschi@gmail.com)
Reported on 2014-04-07 07:14:04 -0700
Last modified on 2015-02-04 03:18:20 -0800
Version trunk
Hardware All All
CC llvm-bugs@lists.llvm.org, michael.v.zolotukhin@gmail.com
Fixed by commit(s)
Attachments
Blocks
Blocked by
See also

GCC provides -ftree-vectorizer-verbose=x to inspect what the vectorizer is doing. This allows you to see e.g. why the vectorizer is not vectorizing a construct in which you expected vectorization (and to fix it).

AFAIK there is no easy way to obtain this information with clang. This is thus a feature request.

One problem with this option in GCC, AFAIK, is that it emits a lot of output when applied to a whole program. It would therefore make sense to be able to restrict the output to selected code regions using a pragma (e.g. similar to those for pushing/popping warnings).

Quuxplusone commented 9 years ago
Clang has the following options that serve this goal:
-Rpass=loop-vectorize
-Rpass-missed=loop-vectorize
-Rpass-analysis=loop-vectorize

Also, one can turn on debug prints from the vectorizer with
'-mllvm -debug-only=loop-vectorize'.

Do you find the information provided by these options insufficient, and would
you like to see something else?
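As a rough illustration of what the remark options above report, here is a minimal sketch with one loop that the vectorizer can usually handle and one that it usually cannot. The file name `remarks_demo.cpp`, the function names, and the `-O2` level are assumptions made for this example; the exact remark text depends on the Clang version and target.

```cpp
// remarks_demo.cpp (hypothetical file name)
//
// One possible invocation, assuming -O2 so that the loop vectorizer runs:
//   clang++ -O2 -c remarks_demo.cpp -Rpass=loop-vectorize
//   clang++ -O2 -c remarks_demo.cpp -Rpass-missed=loop-vectorize
//   clang++ -O2 -c remarks_demo.cpp -Rpass-analysis=loop-vectorize
#include <cstddef>

// Iterations are independent of each other, so this loop is a typical
// candidate for a "vectorized loop" remark from -Rpass=loop-vectorize.
void saxpy(float a, const float* x, float* y, std::size_t n) {
    for (std::size_t i = 0; i < n; ++i) {
        y[i] = a * x[i] + y[i];
    }
}

// Each iteration reads the value written by the previous one (a
// loop-carried dependence), so this loop is a typical candidate for
// -Rpass-missed and -Rpass-analysis remarks.
void running_sum(float* v, std::size_t n) {
    for (std::size_t i = 1; i < n; ++i) {
        v[i] += v[i - 1];
    }
}
```

The three options can also be combined in a single compiler invocation; they are split here only to show which remarks each one produces.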

Quuxplusone commented 9 years ago
That is really _really_ good, thanks!

The only improvement I can think of is to selectively generate the information
for a single loop, by marking the loop with a pragma:

#pragma debug vectorize
for (...) {}

I would find such a feature useful because passing those flags to clang via
CXX="" generates the information for the whole program (which can be huge), and
working around a complex build system to compile a single TU with those flags
is typically a pain (and still generates too much information).

After profiling, one has an idea of which loops are important, and querying
clang about the vectorization of those loops should, in my opinion, be
trivial enough that developers actually do it.
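For comparison, Clang already has a per-loop pragma that covers part of this request, although it is not the `#pragma debug vectorize` proposed above. `#pragma clang loop vectorize(enable)` asks the vectorizer to vectorize the marked loop, and Clang emits a `-Wpass-failed` warning for that loop if the request cannot be honored. The sketch below assumes a hypothetical hot loop named `scale`; it requests vectorization rather than printing diagnostics, so it is only a partial substitute.

```cpp
#include <cstddef>

// Hypothetical hot loop identified by profiling.
void scale(const float* in, float* out, float s, std::size_t n) {
    // Existing Clang pragma, not the proposed "#pragma debug vectorize".
    // If the loop cannot be vectorized, Clang reports it for this loop
    // only, via a -Wpass-failed warning. Other compilers may ignore the
    // pragma or warn that it is unknown.
#pragma clang loop vectorize(enable)
    for (std::size_t i = 0; i < n; ++i) {
        out[i] = in[i] * s;
    }
}
```

Because the warning is tied to the annotated loop, this avoids both the whole-program output and the need to change build flags, at the cost of changing the optimization request for that loop.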