MPLLang / mpl

The MaPLe compiler for efficient and scalable parallel functional programming
Other
306 stars 18 forks source link

compiler crashes (unhandled exception `Option`) in `toRssa` pass #190

Closed shwestrick closed 4 weeks ago

shwestrick commented 4 weeks ago

On Ubuntu 20.04, commit ef2588b3a3ba1b5dca8f2c472a7c7f2250021596 (current main).

As far as I can tell, this happens when the scheduler is loaded but ForkJoin.par is never used in the program.

Minimal example

foo.sml

val _ = print ("hello world\n")

foo.mlb

$(SML_LIB)/basis/basis.mlb
$(SML_LIB)/basis/fork-join.mlb
foo.sml

Compiling foo.mlb raises an exception

$ mpl -verbose 2 foo.mlb
MLton [mpl] 20240604.030409-gef2588b3a-dirty starting
   Compile SML starting
      ...
      toRssa starting
         toRssa raised: Option

Digging

As far as I can tell, the bug happens when the scheduler is loaded but never used. (If I change the source in the above example to use ForkJoin.par, then it works; alternatively, if I don't load $(SML_LIB)/basis/fork-join.mlb at all, then it also works.)

I've confirmed that the bug is here: the call a 1 in turn calls varOp which raises Option.

In the generated SSA2, we see this:

  block L_2263 ()
    val x_4914: thread =
      prim PCall_forkThreadAndSetData (x_4830, x_0 (*obj ()*))

This makes sense? If we load the scheduler but don't use it, then there won't be any calls to PCall_getData, and therefore all data that flows through PCall_forkThreadAndSetData would be useless and could be eliminated.

Solution?

An easy fix might be to case on the RType for arg 1 of PCall_forkThreadAndSetData and just eliminate the prim call entirely in this case.

Alternatively, I've confirmed that if we add val ((), ()) = ForkJoin.pcallFork (fn () => (), fn () => ()) to the bottom of MkScheduler.sml, then the issue goes away, because this ensures that there's always at least one use of PCall_getData. Maybe this is easiest...

shwestrick commented 4 weeks ago

I decided to fix it with the easy solution: adding val ((), ()) = ForkJoin.pcallFork (fn () => (), fn () => ()) to the bottom of the scheduler as mentioned above.

MatthewFluet commented 3 weeks ago

Adding a use of ForkJoin.pcallFork seems reasonable. Dropping the PCall_forkThreadAndSetData primitive when the data argument is unit would also be acceptable; if the data argument is unit then there must not be any pcalls in the program and the PCall_forkThreadAndSetData will never execute (because it is downstream of a conditional that checks for a promotable frame, of which there will be none).