Early Results: Roadblocks and future work

Currently, running this benchmark fails, because compiling in too many parallel tasks seems to trigger a race condition in Julia, as detailed here: https://github.com/JuliaLang/julia/issues/33183

However, profiling shows that there is currently essentially 100% contention when all threads are compiling, because compilation is guarded by a global lock.
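A minimal sketch of how this kind of contention could be observed, assuming nothing about the actual benchmark code in this PR: `make_functions` and `first_call_time` are hypothetical helpers that generate fresh functions and time their first (compiling) call, either serially or with one task per function. On Julia versions affected by the issue linked above, the threaded run may crash rather than finish.

```julia
using Base.Threads

# Build `n` distinct functions under fresh (gensym) names, so that the first
# call of each one needs its own compilation. Hypothetical helper.
function make_functions(n)
    return map(1:n) do i
        name = gensym("work")
        @eval $name(x) = sum(k -> x * k + $i, 1:10)
    end
end

# Time the first call of every function in `fs`. `invokelatest` is needed
# because the functions were defined via `@eval` after this code was compiled.
function first_call_time(fs; parallel::Bool)
    return @elapsed begin
        if parallel
            # One task per function, so many tasks try to compile at once.
            @sync for f in fs
                Threads.@spawn Base.invokelatest(f, 1)
            end
        else
            for f in fs
                Base.invokelatest(f, 1)
            end
        end
    end
end

serial   = first_call_time(make_functions(50); parallel = false)
threaded = first_call_time(make_functions(50); parallel = true)
println("threads = ", nthreads(), "  serial = ", serial, " s  threaded = ", threaded, " s")
```

If compilation is serialized by a global lock, the threaded time would be expected to stay close to the serial time no matter how many threads Julia is started with (for example via `julia -t 4` or `JULIA_NUM_THREADS=4`).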

Some related discussion on this from Slack:

@nhdaly: I don't think we would need to hold a global mutex during compilation, right? Is this on the to-do list for future work for multithreading?

@staticfloat: Is LLVM itself properly multithreaded? I could envision the compilation pipeline itself being made asynchronous (e.g. parsing, Julia optimization, LLVM optimization, LLVM codegen; each stage of compilation running independently of the rest, but within each stage, running serially) but I'm not sure that LLVM itself runs in a multithreaded fashion. That being said, I'm pretty sure a huge chunk of our time is spent in type inference, so maybe there are still large speedups to be had

@nhdaly: Thanks Elliot, yeah that makes sense. I know that this kind of "slow code" has explicitly not been a priority for the multithreading team so far, so I'm not complaining, just exploring the current status. :slightly_smiling_face: Ooh, yeah, i hadn't considered whether LLVM itself could be run simultaneously from multiple threads... :cry: I would not be surprised if it isn't properly multithreaded.

@vchuravy: LLVM can be multithreaded there has been a lot of work in that area recently
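The staged design floated in the discussion above (stages run concurrently with each other, but each stage processes its items serially) can be sketched with tasks and channels. This is only an illustration of the shape of such a pipeline: `run_pipeline` is a hypothetical helper, and the stage functions in the usage example are stand-ins, not real compiler passes.

```julia
# Run `items` through a chain of stage functions. Each stage is a single task,
# so work within a stage is serial, while different stages can overlap on
# different items. Hypothetical helper for illustration only.
function run_pipeline(stage_fns, items)
    source = Channel{Any}(Inf)
    foreach(x -> put!(source, x), items)
    close(source)

    input = source
    for f in stage_fns
        output = Channel{Any}(Inf)
        upstream = input  # per-iteration binding, safe to capture in the task
        Threads.@spawn begin
            try
                for x in upstream
                    put!(output, f(x))
                end
            finally
                close(output)  # always close, so downstream stages terminate
            end
        end
        input = output
    end
    # Drain the final channel; order is preserved because every stage is FIFO.
    return collect(input)
end

# Stand-in "stages" that just tag each item with the stage name.
stages = [x -> "$x:parsed", x -> "$x:inferred", x -> "$x:optimized", x -> "$x:codegen"]
println(run_pipeline(stages, ["f", "g", "h"]))
```

In this arrangement the speedup is bounded by the slowest stage, which is why the remark that a large share of time goes to type inference matters for how much such a pipeline could gain.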

So future work would need to investigate what it takes to do both of the following:

[ ] Allow parallel invocations of the Julia compilation pipeline (parsing, lowering, type inference, Julia optimization, etc.)
[ ] Enable a multithreaded LLVM to allow parallel invocations of the LLVM part (LLVM optimization, LLVM codegen)

RelationalAI-oss / MultithreadingBenchmarks.jl

[WIP] All threads compiling parallelism scaling experiment #5