tum-ei-eda / seal5

Seal5 - Semi-automated LLVM Support for RISC-V Extensions including Autovectorization
https://tum-ei-eda.github.io/seal5/
Apache License 2.0

Parallelize transforms #52

Open PhilippvK opened 7 months ago

PhilippvK commented 7 months ago

We should be able to speed up the flow (transforms/backends) by introducing parallelism.

Possible levels:

- Transforms: since we currently handle transforms via Python subprocesses, it should be trivial to parallelize those.
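
Parallelizing the transform subprocesses could look roughly like the sketch below. This is a minimal example, not Seal5 code: it assumes each transform can be expressed as an independent argv list, and `run_transforms_parallel` is a hypothetical helper.

```python
import subprocess
from concurrent.futures import ThreadPoolExecutor

def run_transform(cmd):
    """Run one transform as a child process and capture its output."""
    # The GIL is released while waiting on the subprocess, so plain
    # threads are enough to get real parallelism across transforms.
    return subprocess.run(cmd, capture_output=True, text=True, check=True)

def run_transforms_parallel(commands, max_workers=8):
    """Launch independent transform subprocesses concurrently."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = [pool.submit(run_transform, cmd) for cmd in commands]
        # Collect results in submission order; a failing transform
        # raises CalledProcessError when its future is resolved.
        return [f.result() for f in futures]
```

Since the transforms are already isolated processes, threads (rather than a process pool) are sufficient here; the Python side only waits on the children.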

thomasgoodfellow commented 7 months ago

Most of the flow's processing time is spent on git operations (times in seconds, from the se-henri server at DLR running 4 CPUs, piping demo.py through the "ts" util):

| Phase | Time (s) |
|---|---|
| loading core_desc | 9 |
| applying seal5 patches | 9 |
| transforming | 14 |
| generating instruction patches | 7 |
| applying instruction patches | 64 |

The Python git module used to apply the patches forks a lot of git processes (hundreds), which is where most of the 64 seconds goes. Possible remedies:

  1. Aggregate the patches into a single commit rather than the current commit-per-instruction approach. Having many commits is useful for debugging (bisecting) and cherry-picking, but in a real "product" one might want to squash them into a single "Add FOO extension" commit anyway.
  2. Write the patches for the instructions as multiple commits in a single mail patch, then apply that with "git am" (the current approach is "git apply" + "git add" + "git commit" for each instruction patch).

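
The second option could be sketched as follows. This assumes the per-instruction patches are already in mbox/mail format (e.g. produced with `git format-patch`); `apply_patch_mbox` is a hypothetical helper, not part of Seal5.

```python
import subprocess
from pathlib import Path

def apply_patch_mbox(repo_dir, patch_files, mbox_path="combined.mbox"):
    """Apply many mail-format patches with a single `git am` invocation."""
    # Concatenate the per-instruction mail patches into one mbox, so one
    # `git am` call replaces hundreds of apply/add/commit git forks while
    # still preserving one commit per instruction in the history.
    mbox = Path(mbox_path)
    with mbox.open("wb") as out:
        for p in patch_files:
            out.write(Path(p).read_bytes())
    subprocess.run(["git", "am", str(mbox.resolve())], cwd=repo_dir, check=True)
```

This keeps the commit-per-instruction granularity for bisecting while cutting the process-spawn overhead to a single git invocation.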
PhilippvK commented 7 months ago

Following up after our discussion...

Tasks:

PhilippvK commented 7 months ago

BTW, you can find the full log output (including timestamps, all verbosity levels, and the output of previous runs) in /tmp/seal5_llvm_demo/.seal5/logs/seal5.log

PhilippvK commented 6 months ago

I added instruction-level parallelism to the behav_to_pat pass, which is the most time-critical transform, cutting the runtime (for XCoreVSimd) from 80 s to 10 s on an 18C/36T CPU.
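
The shape of that change might look like the sketch below. The real behav_to_pat pass is of course far more involved; the placeholder conversion and `convert_all` are illustrative only.

```python
from concurrent.futures import ProcessPoolExecutor

def behav_to_pat(instr):
    """Placeholder for the per-instruction behavior -> pattern conversion."""
    # Stand-in for the real (CPU-bound) conversion work.
    return instr.upper()

def convert_all(instructions, max_workers=None):
    """Convert each instruction in a separate worker process."""
    # Each instruction is converted independently, so the pass is
    # embarrassingly parallel; worker processes sidestep the GIL
    # for the CPU-bound conversion work.
    with ProcessPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(behav_to_pat, instructions))
```

Because each instruction's conversion is independent, the speedup scales roughly with core count until per-task overhead dominates, consistent with the 80 s to 10 s improvement reported above.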

PhilippvK commented 6 months ago

This feature will be ported to the other transforms as well once we have a more stable transforms API, which should also help reduce duplicated code.