Open svkeerthy opened 7 months ago
Hi @m-atalla,
Apologies for the delay in responding. We do not have a script for this yet. It would be great if you could help with this. SQLite Amalgamation is also very interesting and would be a valuable addition.
We have started integrating OMP with IR2Vec (See #105, which is a work in progress).
Please feel free to reach out if you need any inputs or have further questions. Will be happy to help :)
Best, Venkat
Hi, I wanted to follow up with profiling info on the SQLite benchmark now that it's added!
I used Linux perf to get the profile data using the following commands:
$ perf record -g --call-graph dwarf build/bin/ir2vec --sym -level p ./src/test-suite/PE-benchmarks-llfiles-llvm17/sqlite3.ll -o sqlite.txt
$ perf script > /tmp/sym-perf.out
And I used the Firefox Profiler to analyze and upload the profile data, which can be found here. From the call tree, it seems that about 53% of the time is spent on parsing (not much can be done about that) and 44% is spent in IR2Vec_Symbolic::bb2Vec, which should be a good target for parallelism. Fortunately, it looks like #105 is already making progress on it!
Similarly, I generated a profile for the flow-aware (FA) mode, which can be found here. The call tree shows the functions IR2Vec_FA::solveInsts and IR2Vec_FA::func2Vec taking 33% and 24% of the time, respectively.
I'd be happy to assist further as needed.
Thank you. Mohamed.
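To illustrate why bb2Vec looks like a reasonable target for parallelism, here is a minimal OpenMP sketch. It assumes each basic block's vector can be computed independently of the others, which may not hold exactly for the real IR2Vec_Symbolic::bb2Vec. The names blockToVec and allBlocksToVec are hypothetical stand-ins, and the code is not taken from #105.

```cpp
#include <cstddef>
#include <vector>

using Vec = std::vector<double>;  // hypothetical stand-in for an embedding vector

// Hypothetical per-block work: element-wise sum of the instruction vectors.
// Assumes every instruction vector has at least `dim` elements.
Vec blockToVec(const std::vector<Vec> &instVecs, std::size_t dim) {
    Vec result(dim, 0.0);
    for (const Vec &inst : instVecs)
        for (std::size_t i = 0; i < dim; ++i)
            result[i] += inst[i];
    return result;
}

// Assumption: blocks are independent, so loop iterations can run concurrently.
// Each iteration writes only to its own pre-allocated slot in `out`,
// so no locking is needed.
std::vector<Vec> allBlocksToVec(const std::vector<std::vector<Vec>> &blocks,
                                std::size_t dim) {
    std::vector<Vec> out(blocks.size());
    const long n = static_cast<long>(blocks.size());
#pragma omp parallel for schedule(dynamic)
    for (long b = 0; b < n; ++b)
        out[b] = blockToVec(blocks[b], dim);
    return out;
}
```

With GCC or Clang this needs -fopenmp to actually run in parallel; without the flag the pragma is ignored and the loop runs serially with the same result.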
Hi @m-atalla,
Thanks for the perf report :) It exposes more opportunities for optimizations in addition to parallelization.
Off the top of my head, I have two things:
1. The solveInsts method that internally invokes the Eigen solver. We recently made Eigen an optional dependency, i.e., if Eigen is not available, we approximate the solution with a handwritten solver. It would be interesting to see if it reduces the current overhead.
2. The IR2Vec_FA::func2Vec method. It would be good to eliminate or reduce this overhead by using references or moves.
Perhaps I will create separate issues to track these, as the objective of these points is a bit different from that of the current issue. Please give me some time. I will have a more detailed look at the perf report and get back with more possible improvements.
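As a rough illustration of the "references or moves" idea, the sketch below counts copies of a vector-like object under three calling styles. Embedding, sumByValue, sumByRef, and appendMoved are hypothetical names made up for this example and do not correspond to actual IR2Vec code.

```cpp
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical stand-in for an embedding; counts copies so the cost is visible.
struct Embedding {
    std::vector<double> values;
    static int copyCount;

    explicit Embedding(std::size_t dim, double fill = 0.0) : values(dim, fill) {}
    Embedding(const Embedding &other) : values(other.values) { ++copyCount; }
    Embedding(Embedding &&other) noexcept = default;
    Embedding &operator=(const Embedding &other) {
        values = other.values;
        ++copyCount;
        return *this;
    }
    Embedding &operator=(Embedding &&other) noexcept = default;
};
int Embedding::copyCount = 0;

// Pass by value: calling this with an lvalue copies the whole buffer.
double sumByValue(Embedding e) {
    double total = 0.0;
    for (double v : e.values) total += v;
    return total;
}

// Pass by const reference: the caller's buffer is read in place, no copy.
double sumByRef(const Embedding &e) {
    double total = 0.0;
    for (double v : e.values) total += v;
    return total;
}

// Sink parameter plus std::move: the buffer is transferred into the container
// instead of being duplicated, provided the caller passes an rvalue.
void appendMoved(std::vector<Embedding> &out, Embedding e) {
    out.push_back(std::move(e));
}
```

Calling sumByValue(e) on a named object bumps Embedding::copyCount by one, while sumByRef(e) and appendMoved(out, std::move(e)) leave it unchanged. Whether the same pattern applies inside func2Vec would need to be confirmed against the profile.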
Hi, this seems like an interesting enhancement that I'd like to help out with.
I think it's important to have a baseline to compare against for any potential improvements. Is the TimeTaken experiment suitable for that? Further, is there a script I could use to generate the time taken, as in experiments/TimeTaken/TimeTaken_Algos.csv?
I'd be happy to add an additional benchmark as well; the SQLite Amalgamation might be an interesting option.
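For the baseline question, one stopgap until a script exists could be a small in-process timing helper. This is only a sketch: medianSeconds is a hypothetical name, not part of IR2Vec, and it makes no attempt to reproduce the format of TimeTaken_Algos.csv.

```cpp
#include <algorithm>
#include <chrono>
#include <cstddef>
#include <vector>

// Runs `work` `runs` times and returns the middle wall-clock sample in seconds
// (the upper of the two middle samples when `runs` is even).
// Returns 0.0 if `runs` is not positive.
template <typename Fn>
double medianSeconds(Fn &&work, int runs) {
    if (runs <= 0) return 0.0;
    std::vector<double> samples;
    samples.reserve(static_cast<std::size_t>(runs));
    for (int i = 0; i < runs; ++i) {
        // steady_clock is monotonic, so it is unaffected by system clock changes.
        const auto start = std::chrono::steady_clock::now();
        work();
        const auto stop = std::chrono::steady_clock::now();
        samples.push_back(std::chrono::duration<double>(stop - start).count());
    }
    std::sort(samples.begin(), samples.end());
    return samples[samples.size() / 2];
}
```

Using the median rather than the mean makes the number less sensitive to a single slow run, which helps when comparing before and after an optimization.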