plast-lab / cclyzer

A tool for analyzing LLVM bitcode using Datalog.
MIT License

Slowdown due to Value::printAsOperand() method #1

Closed gbalats closed 9 years ago

gbalats commented 9 years ago

This was first posted on the LLVM Dev mailing list by @kferles, but it remains unresolved.

The tool uses the Value::printAsOperand() method to print the operands of several LLVM bitcode instructions to the CSV file. This approach doesn't scale, and profiling suggests that Value::printAsOperand() itself is the bottleneck.

The problem is the slow path of this method, which constructs a TypePrinting object from scratch every time it is triggered. Each time the slow path is taken, it seems to invoke methods (e.g., TypeFinder::run()) that perform many module-wide calculations, all of which are redundant after the first time. This whole process accounts for most of the execution time.
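
To make the cost pattern concrete, here is a minimal standalone sketch. It does not use LLVM headers: `ToyModule`, `scanTypes`, `printSlow` and `CachedPrinter` are hypothetical names, and `scanTypes` only stands in for module-wide work such as TypeFinder::run(). It shows why rebuilding module-wide state on every call scales badly compared to building it once. It is not a way to fix printAsOperand() from the client side.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Toy stand-in for a module: just a list of type names (not an LLVM class).
struct ToyModule {
    std::vector<std::string> typeNames;
};

// Counts how many module-wide scans have been performed so far.
static std::size_t scanCount = 0;

// Stand-in for the module-wide work: visits every type in the module.
std::vector<std::string> scanTypes(const ToyModule &m) {
    ++scanCount;
    return m.typeNames;  // copies the whole table, i.e. O(#types) per call
}

// Slow path: the table is rebuilt from scratch on every single call.
std::string printSlow(const ToyModule &m, std::size_t typeIdx) {
    std::vector<std::string> table = scanTypes(m);
    return "%" + table.at(typeIdx);
}

// Cached variant: the table is built once, then reused for every print.
class CachedPrinter {
public:
    explicit CachedPrinter(const ToyModule &m) : table_(scanTypes(m)) {}

    std::string print(std::size_t typeIdx) const {
        return "%" + table_.at(typeIdx);
    }

private:
    std::vector<std::string> table_;
};
```

With N calls, `printSlow` performs N module-wide scans, whereas a single `CachedPrinter` performs one. The slow path described above behaves like the former.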

We should find a faster way to perform the same task without relying on any internal API, since we want to keep our tool as an LLVM client.

epsallida commented 9 years ago

The Value::getName() method can be used to get the name of an LLVM Value. It returns an empty string when the Value doesn't have a name (e.g., %0 or %1, or when it is a Constant).

gbalats commented 9 years ago

That's right! I have made some changes to call Value::getName() whenever possible. This did not make any significant difference, since most of the variables are unnamed. For the unnamed ones, I ported the numbering logic from LLVM's llvm-diff tool, which computes the variable numbering and uses it to print unnamed variables.
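
A rough standalone sketch of that numbering scheme follows. It is a simplified model, not the code in cclyzer or llvm-diff, and it uses no LLVM headers. `ToyValue`, `ToyBlock`, `ToyFunction`, `computeNumbering` and `operandName` are hypothetical names. The assumption is that unnamed arguments, unnamed basic blocks and unnamed non-void instructions receive consecutive numbers in that order, starting from 0 for each function.

```cpp
#include <map>
#include <string>
#include <vector>

// Toy stand-in for a value: an optional name and a "produces no result" flag.
struct ToyValue {
    std::string name;  // empty when the value is unnamed
    bool isVoid;       // true for instructions such as store
};

struct ToyBlock {
    ToyValue label;
    std::vector<ToyValue> instructions;
};

struct ToyFunction {
    std::vector<ToyValue> args;
    std::vector<ToyBlock> blocks;
};

using Numbering = std::map<const ToyValue *, unsigned>;

// Walks the function once and assigns the next free number to every unnamed
// argument, unnamed block label and unnamed non-void instruction.
Numbering computeNumbering(const ToyFunction &f) {
    Numbering numbering;
    unsigned next = 0;
    for (const ToyValue &arg : f.args)
        if (arg.name.empty())
            numbering[&arg] = next++;
    for (const ToyBlock &block : f.blocks) {
        if (block.label.name.empty())
            numbering[&block.label] = next++;
        for (const ToyValue &inst : block.instructions)
            if (inst.name.empty() && !inst.isVoid)
                numbering[&inst] = next++;
    }
    return numbering;
}

// Prints a value as an operand: by name if it has one, else by its number.
std::string operandName(const ToyValue &v, const Numbering &numbering) {
    if (!v.name.empty())
        return "%" + v.name;
    Numbering::const_iterator it = numbering.find(&v);
    if (it == numbering.end())
        return std::string();  // unnamed void value: nothing to print
    return "%" + std::to_string(it->second);
}
```

The numbering is computed once per function and then every lookup is a map access, so no module-wide work is repeated per operand. The map is keyed by address, so the function must not be modified between `computeNumbering` and the lookups.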

I also avoid calling the WriteAsOperand() method on constants, since they can be appended to a stream by calling a cheaper function (print, I think).

Right now, only values of metadata type still go through the expensive WriteAsOperand() method. If we find a way to eliminate this, then I think that we will no longer face any scalability issues in the fact generation step.