Split generated kernels.cpp for faster builds

pdamme commented 1 year ago

tba

philipportner commented 4 months ago

FYI: I'm working on this one.

philipportner commented 4 months ago

I'll post this as an information dump for now:

ninja writes a log file that can be converted to the Chrome tracing format with the tool ninjatracing. The result can then be viewed in the browser using ui.perfetto. Looking at this trace, the biggest offender is the kernels.cpp file mentioned in this issue; the second-largest contribution to compile time is the compilation of DaphneDialect.cpp.
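For a quick look without the browser, the same log can be ranked directly. The following is only a minimal sketch: it assumes the tab-separated .ninja_log layout (start ms, end ms, mtime, output path, command hash) and the function names are made up for illustration.

```python
import sys
from pathlib import Path


def parse_ninja_log(text):
    """Return (output, duration_ms) pairs, slowest first.

    Assumes each non-comment line is tab-separated as:
    start_ms, end_ms, mtime, output, command_hash.
    """
    durations = {}
    for line in text.splitlines():
        if not line or line.startswith("#"):
            continue  # skip the "# ninja log vN" header and blank lines
        fields = line.split("\t")
        if len(fields) < 4:
            continue  # not a line in the assumed layout
        start_ms, end_ms, output = int(fields[0]), int(fields[1]), fields[3]
        # A later entry for the same output (a rebuild) replaces the earlier one.
        durations[output] = end_ms - start_ms
    return sorted(durations.items(), key=lambda kv: kv[1], reverse=True)


def slowest_steps(log_path, top=10):
    """Read a .ninja_log file and return its `top` slowest build steps."""
    return parse_ninja_log(Path(log_path).read_text())[:top]


if __name__ == "__main__" and len(sys.argv) > 1:
    for output, ms in slowest_steps(sys.argv[1]):
        print(f"{ms / 1000:8.1f} s  {output}")
```

Run it with the path to build/.ninja_log as the only argument. The numbers are wall-clock time per build step, so with parallel builds they do not add up to the total build time.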

As the granularity of the ninja log is not that helpful, I tried clang's -ftime-trace. Compilation with clang somewhat works even though execution fails, so the time trace can still be extracted. With that, we have some profiling information about why individual compilation units take so long.

For kernels.cpp, the frontend takes roughly 25% of the time, with the backend taking the other 75%. Roughly half of the frontend's time is spent on template instantiations. Here the biggest offender is Eigen::EigenSolver<Eigen::Matrix<T with two types, double and float. The backend does not appear to have a single dominant cost; rather, many function-level optimization passes and codegen passes are performed.
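A breakdown like this can also be pulled out of the trace JSON with a few lines of script. This is a rough sketch, not the exact procedure used here. It assumes the trace has a "traceEvents" list of complete ("ph": "X") events with "dur" in microseconds, that the summary events are named "Total Frontend", "Total Backend", and so on, and that template instantiations appear as "InstantiateClass" and "InstantiateFunction" with the instantiated name in args.detail. Event names may differ between clang versions.

```python
import json
import sys
from collections import defaultdict


def summarize_time_trace(trace):
    """Return (totals, instantiations) from a parsed -ftime-trace JSON object.

    totals: dict mapping e.g. "Frontend" to its total duration in microseconds.
    instantiations: list of (detail, summed duration), largest first.
    """
    totals = {}
    per_template = defaultdict(int)
    for ev in trace.get("traceEvents", []):
        if ev.get("ph") != "X":
            continue  # only complete events carry a duration
        name = ev.get("name", "")
        dur = ev.get("dur", 0)
        if name.startswith("Total "):
            # Assumed naming of clang's summary events, e.g. "Total Frontend".
            totals[name[len("Total "):]] = dur
        elif name in ("InstantiateClass", "InstantiateFunction"):
            detail = ev.get("args", {}).get("detail", "<unknown>")
            # Durations are inclusive, so nested instantiations overlap.
            per_template[detail] += dur
    ranked = sorted(per_template.items(), key=lambda kv: kv[1], reverse=True)
    return totals, ranked


if __name__ == "__main__" and len(sys.argv) > 1:
    with open(sys.argv[1]) as f:
        totals, ranked = summarize_time_trace(json.load(f))
    for phase, us in sorted(totals.items(), key=lambda kv: kv[1], reverse=True):
        print(f"{us / 1e6:8.2f} s  {phase}")
    for detail, us in ranked[:10]:
        print(f"{us / 1e6:8.2f} s  {detail}")
```

Because the per-template sums are inclusive, they are useful for ranking offenders but should not be added together.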

Overall, it looks like splitting kernels.cpp into multiple compilation units should do the trick.

Looking at the fine-grained -ftime-trace output for DaphneDialect.cpp, I believe that we should do the same there.

To reproduce the traces:

For the .ninja_log trace, clone ninjatracing and run ./ninjatracing ../daphne/build/.ninja_log > trace.json
For -ftime-trace, one needs to compile daphne with clang: export CXX=/path/to/clang++ and export CC=/path/to/clang, then add set(CMAKE_CXX_FLAGS "${CMAKE_CXX_FLAGS} -ftime-trace") to CMakeLists.txt. The traces are written right alongside the compilation unit; for kernels.cpp that would be daphne/build/src/runtime/local/kernels/

ninja_log_trace.json DaphneDialect.cpp.json kernels.cpp.json

daphne-eu / daphne

Split generated kernels.cpp for faster builds #516