Open safesparrow opened 1 year ago
I did a PoC, but it seems to break on FCS. I'm not sure why - there is either a dependency between phases I'm not aware of, or there is a bug in scheduling the items/propagating state. Node complaining is file idx 7, phase 3:
Update: I had a few bugs, since fixed. FCS compilation now works 🎉
Very first snapshot:
Worth noting the ratio between time spent in each phase 1, 2, 3:
Timings before & after:
# Sequential:
Real: 32.8 Realdelta: 15.8 Cpu: 18.4 Cpudelta: 10.9 Mem: 1445 G0: 1245 G1: 242 G2: 1 [Optimizations]
# Parallel:
Real: 29.3 Realdelta: 12.5 Cpu: 13.2 Cpudelta: 4.2 Mem: 1641 G0: 1250 G1: 258 G2: 1 [Optimizations]
So less speedup than I'd have expected: Real only drops from 32.8 to 29.3, and Realdelta from 15.8 to 12.5.
EDIT: Actually I forgot to enable Server GC in the latest test. Timings with Server GC:
# Sequential:
Real: 22.9 Realdelta: 12.6 Cpu: 44.5 Cpudelta: 12.8 Mem: 4054 G0: 7 G1: 3 G2: 1 [Optimizations]
# Parallel 1:
Real: 18.5 Realdelta: 8.0 Cpu: 45.8 Cpudelta: 13.7 Mem: 4423 G0: 4 G1: 2 G2: 1 [Optimizations]
# Parallel 2:
Real: 17.4 Realdelta: 7.3 Cpu: 47.8 Cpudelta: 15.6 Mem: 4355 G0: 4 G1: 2 G2: 1 [Optimizations]
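To put the Server GC numbers side by side, here is a small sketch that parses two of the timing lines above and prints the ratio between them. It assumes that Realdelta is the wall-clock time of the [Optimizations] phase alone and that Real is cumulative; `parse_timing` is a helper made up for this comment, not part of the compiler.

```python
import re


def parse_timing(line: str) -> dict:
    """Turn 'Real: 22.9 Realdelta: 12.6 ...' into {'Real': 22.9, 'Realdelta': 12.6, ...}."""
    return {key: float(value) for key, value in re.findall(r"(\w+): ([\d.]+)", line)}


# Lines copied from the Server GC runs above.
sequential = parse_timing(
    "Real: 22.9 Realdelta: 12.6 Cpu: 44.5 Cpudelta: 12.8 Mem: 4054 G0: 7 G1: 3 G2: 1 [Optimizations]"
)
parallel_2 = parse_timing(
    "Real: 17.4 Realdelta: 7.3 Cpu: 47.8 Cpudelta: 15.6 Mem: 4355 G0: 4 G1: 2 G2: 1 [Optimizations]"
)

for key in ("Real", "Realdelta"):
    speedup = sequential[key] / parallel_2[key]
    print(f"{key}: {sequential[key]} -> {parallel_2[key]} ({speedup:.2f}x)")
```

Under that reading, the "Parallel 2" run is roughly 1.7x faster on Realdelta and roughly 1.3x faster on Real.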
Bonus quest: The above has a limited potential for parallelisation.
What can speed it up further is a graph-based approach yet again.
For wider context, there are a few different ways optimisation of a file might behave. It might:
inlines/other const values get evaluated)
Case 4 can be ruled out because the code doesn't allow for this kind of dependency.
If:
For case 2, the main issue would be making the optimization methods additive, i.e. having the equivalent of AddResultsToTcState for optimization.
EDIT: I had a quick look, and I'm cautiously optimistic that it will be fairly trivial to make file optimization return a delta that can then be combined with others.
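A rough illustration of what "return a delta that can be combined" could look like. This is a Python sketch and not the compiler's code: `OptDelta`, `combine` and `optimize_file` are all hypothetical names, and the "later delta wins on a key clash" rule is an assumption.

```python
from dataclasses import dataclass, field
from functools import reduce


@dataclass(frozen=True)
class OptDelta:
    """Hypothetical per-file result: only the entries this file contributes."""
    value_infos: dict = field(default_factory=dict)


def combine(a: OptDelta, b: OptDelta) -> OptDelta:
    """Merge two deltas into a new one; on a key clash the later delta wins (assumption)."""
    return OptDelta({**a.value_infos, **b.value_infos})


def optimize_file(name: str, env: OptDelta) -> OptDelta:
    """Stand-in for optimizing one file: reads env, returns a delta, mutates nothing."""
    return OptDelta({f"{name}.seen": len(env.value_infos)})


base = OptDelta({"A.f": "inline-candidate"})

# B and C do not depend on each other, so both can be optimized against `base`
# (possibly in parallel) and their deltas merged afterwards.
delta_b = optimize_file("B", base)
delta_c = optimize_file("C", base)
env = reduce(combine, [delta_b, delta_c], base)
print(env.value_infos)
```

Because the deltas here have disjoint keys, merging them in either order gives the same environment, which is the property a graph-based scheduler would rely on.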
Not type-checking related, but I'm doing a short investigation into the following:
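In Release mode, there are 3 main optimisation phases/rounds. Round 1 only uses the results of round 1 for previous files, round 2 only uses the results of round 2 for previous files, etc.
This means that rounds for different files can overlap (for example, round 2 of one file can run alongside round 1 of the next), so we can easily increase parallelisation.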
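The dependency shape described here can be sketched as a "wavefront" over a grid of (file idx, round) nodes. The Python below is only an illustration under two assumptions: a round for a file needs the same round of the previous file, and it also needs the previous round of the same file. `wavefront_schedule` is a made-up name, not anything in the compiler.

```python
from collections import defaultdict


def wavefront_schedule(num_files: int, num_rounds: int = 3) -> list:
    """Group (file_idx, round) nodes into steps whose nodes are mutually independent.

    Assumed dependencies for node (f, r):
      (f - 1, r): same round of the previous file (and so, transitively, all earlier files)
      (f, r - 1): previous round of the same file
    Under these, the earliest step a node can run in is f + (r - 1).
    """
    steps = defaultdict(list)
    for f in range(num_files):                 # file idx is 0-based
        for r in range(1, num_rounds + 1):     # rounds are numbered 1..num_rounds
            steps[f + (r - 1)].append((f, r))
    return [steps[s] for s in range(len(steps))]


schedule = wavefront_schedule(num_files=4)
for step, nodes in enumerate(schedule):
    print(step, nodes)
```

With 4 files and 3 rounds this gives 6 steps instead of 12 sequential ones. Under these assumptions no step ever holds more than min(files, rounds) nodes, so the width is capped at 3 however many files there are.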