sstsimulator / sst-macro

SST Macro Element Library
http://sst-simulator.org/

Figure out if we can fix build time issues. #526

Closed calewis closed 1 year ago

calewis commented 4 years ago

On a 24-core machine, building sst-macro with make -j48 takes about half as long as compiling all of LLVM. That feels slow. This issue isn't critical, but I would like to investigate it for my own benefit.

calewis commented 4 years ago

Time to build sst-macro before changes:

make -j48

704.59user 49.07system 1:47.25elapsed 702%CPU (0avgtext+0avgdata 911352maxresident)k
0inputs+8343568outputs (0major+16032546minor)pagefaults 0swaps
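To put the timing line above in perspective, here is a minimal sketch that derives the average number of busy cores from it. It assumes the output comes from GNU time in its default format, where elapsed time is printed as minutes:seconds; the variable names are illustrative only.

```python
import re

# The resource-usage line reported above for `make -j48`.
time_line = "704.59user 49.07system 1:47.25elapsed 702%CPU"

# Assumes GNU time's default layout: <user>user <sys>system <m>:<s>elapsed
match = re.match(r"([\d.]+)user ([\d.]+)system (\d+):([\d.]+)elapsed", time_line)
user_s = float(match.group(1))
system_s = float(match.group(2))
elapsed_s = int(match.group(3)) * 60 + float(match.group(4))

# Average number of cores kept busy over the whole build.
busy_cores = (user_s + system_s) / elapsed_s
print(f"elapsed: {elapsed_s:.2f}s, average busy cores: {busy_cores:.2f}")
```

The result is about 7 busy cores on average, which matches the 702%CPU figure and is far below the 24 cores available.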

jjwilke commented 4 years ago

One thing we could try that wouldn't take much effort is to "flatten" the build: drop the helper libraries and add all source files to a single libsstmac.

calewis commented 4 years ago

Some useful info about the build, which will hopefully let us focus our efforts:

Total time in Frontend: 280.391584s
Total time in Instantiations: 108.983187s
    Diff: 171.408397s

Source Files to Look At
36.490055s in sst-macro/bin/clang/clangHeaders.h
30.25605s in llvm9-bootstrap//include/clang/Parse/Parser.h
25.961865s in sst-macro/sprockit/sprockit/serializable.h
25.727075s in sst-macro/sstmac/common/serializable.h
22.752565s in sst-macro/sstmac/common/event_scheduler.h
20.21179s in sst-macro/sstmac/common/timestamp.h
16.293427s in include/c++/5.4.0/ios
15.432697s in sst-macro/sprockit/sprockit/serialize_serializable.h
14.92125s in llvm9-bootstrap//include/clang/AST/OpenMPClause.h
14.703606s in include/c++/5.4.0/string

Instantiations to Look At
8.575131s in sprockit::BuilderDatabase::getLibrary<sstmac::Statistic<void>, sstmac::EventScheduler *, const std::__cxx11::basic_string<char> &, const std::__cxx11::basic_string<char> &, SST::Params &>
8.555578s in sprockit::BuilderLibraryDatabase<sstmac::Statistic<void>, sstmac::EventScheduler *, const std::__cxx11::basic_string<char> &, const std::__cxx11::basic_string<char> &, SST::Params &>::getLibrary
7.907054s in sprockit::BuilderDatabase::getLibrary<sstmac::StatisticOutput, SST::Params &>
7.889997s in sprockit::BuilderLibraryDatabase<sstmac::StatisticOutput, SST::Params &>::getLibrary
6.826532s in sprockit::BuilderDatabase::getLibrary<sstmac::ParallelRuntime, SST::Params &>
6.809628s in sprockit::BuilderLibraryDatabase<sstmac::ParallelRuntime, SST::Params &>::getLibrary
3.294431s in sprockit::BuilderDatabase::getLibrary<sstmac::sw::ThreadContext>
3.287196s in sprockit::BuilderLibraryDatabase<sstmac::sw::ThreadContext>::getLibrary
3.266206s in sprockit::BuilderDatabase::getLibrary<sstmac::sw::API, SST::Params &, sstmac::sw::App *, sstmac::Component *>
3.258856s in sprockit::BuilderLibraryDatabase<sstmac::sw::API, SST::Params &, sstmac::sw::App *, sstmac::Component *>::getLibrary

I'll ignore clang for now and point out that we spend roughly 10% of our time on serialization code. Next I'll try to figure out how well the build scales.
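As a rough check on the serialization share, the sketch below sums the three serialization headers from the list above and compares the total against two possible denominators. Which denominator the estimate refers to is an assumption here, and the per-header times may overlap if they include nested headers, so treat the sum as an upper bound.

```python
# Per-header frontend times (seconds) copied from the list above.
serialization_headers = {
    "sprockit/sprockit/serializable.h": 25.961865,
    "sstmac/common/serializable.h": 25.727075,
    "sprockit/sprockit/serialize_serializable.h": 15.432697,
}

frontend_total_s = 280.391584  # "Total time in Frontend"
user_cpu_s = 704.59            # user CPU time reported for make -j48

serialization_s = sum(serialization_headers.values())

# Share of frontend time vs. share of all user CPU time.
# Assumption: header times do not double count nested includes.
frontend_share = serialization_s / frontend_total_s
user_share = serialization_s / user_cpu_s

print(f"serialization headers: {serialization_s:.1f}s")
print(f"share of frontend time: {frontend_share:.1%}")
print(f"share of user CPU time: {user_share:.1%}")
```

Measured against total user CPU time, the share is a little under 10%; measured against frontend time alone, it is closer to a quarter.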

calewis commented 4 years ago

Threads  Speedup
1        1
3        2.5
6        3.8
8        4.2
12       4.9
24       5.5

We run out of parallelism somewhere around 12 threads.
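One way to quantify this is to compute the parallel efficiency and the serial fraction implied by Amdahl's law (the Karp-Flatt metric) for each measurement. This is only a sketch: Amdahl's law is a simplistic model of a build, and the function names are illustrative.

```python
# Threads -> measured speedup, from the table above.
measurements = {1: 1.0, 3: 2.5, 6: 3.8, 8: 4.2, 12: 4.9, 24: 5.5}


def efficiency(threads, speedup):
    """Fraction of ideal linear speedup actually achieved."""
    return speedup / threads


def serial_fraction(threads, speedup):
    """Serial fraction s solving speedup = 1 / (s + (1 - s) / threads)."""
    return (threads / speedup - 1) / (threads - 1)


for threads, speedup in measurements.items():
    eff = efficiency(threads, speedup)
    if threads > 1:
        frac = serial_fraction(threads, speedup)
        print(f"{threads:>2} threads: efficiency {eff:.2f}, "
              f"implied serial fraction {frac:.2f}")
    else:
        print(f"{threads:>2} threads: efficiency {eff:.2f}")
```

Under this model the implied serial fraction at 24 threads is about 0.15, which would cap the speedup at roughly 7x no matter how many threads are added.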

berquist commented 1 year ago

Now that development is moving to elements, we are focusing on build system improvements there instead.