Monomorphisation [do not merge yet]

Rewbert commented 2 years ago

The initial commit 5341984 can turn the following code

{- defaultValue has type Exp a, it's just meant to make the example more interesting
by performing an assignment, which was previously impossible because we couldn't
do type-specialized operations in polymorphic procedures. -}
setDefault :: forall a. DefSSMExp a => Ref a -> SSM ()
setDefault = box "setDefault" ["r"] $ \r -> do
    r <~ defaultValue @a

program :: SSM ()
program = boxNullary "program" $ do
  int32 <- var (1 :: Exp Int32)
  int64 <- var (1 :: Exp Int64)
  bool <- var true'
  fork [ setDefault int32
       , setDefault int64
       , setDefault bool
       ]

into the pretty printed code

entrypoint:
  program()

global variables:

program() {
  int *fresh0 = var 1
  int64 *fresh1 = var 1
  bool *fresh2 = var True
  fork [ setDefaultRefInt32(fresh0)
       , setDefaultRefInt64(fresh1)
       , setDefaultRefBool(fresh2)
       ]
}

setDefaultRefBool(bool* r) {
  r = False
}

setDefaultRefInt32(int* r) {
  r = 0
}

setDefaultRefInt64(int64* r) {
  r = 0
}

Which is a nice start.

The following program, where we have two different local functions that define a procedure fun

fun1 :: Ref Int32 -> SSM ()
fun1 = box "fun1" ["x"] $ \x -> do
    fork [ fun x ]
  where
      fun :: Ref Int32 -> SSM ()
      fun = box "fun" ["x"] $ \x -> do
          x <~ (0 :: Exp Int32)

fun2 :: Ref Int32 -> SSM ()
fun2 = box "fun2" ["x"] $ \x -> do
    fork [ fun x ]
  where
      fun :: Ref Int32 -> SSM ()
      fun = box "fun" ["x"] $ \x -> do
          x <~ (1 :: Exp Int32)

testprogram :: SSM ()
testprogram = boxNullary "testprogram" $ do
    int32 <- var (5 :: Exp Int32)
    fork [ fun1 int32
         , fun2 int32
         ]

Produce this pretty printed code

entrypoint:
  testprogram()

global variables:

fun1RefInt32(int* x) {
  fork [ funRefInt32(x)
       ]
}

fun2RefInt32(int* x) {
  fork [ funRefInt32(x)
       ]
}

funRefInt32(int* x) {
  x = 0
}

testprogram() {
  int *fresh0 = var 5
  fork [ fun1RefInt32(fresh0)
       , fun2RefInt32(fresh0)
       ]
}

Right now the machinery only relies on the name & types of a procedure to specialize it, so the above example produces wrong code. There are two ways forward - either we add source information to make the compiler understand that these are two different definitions, or we inspect the procedure body to determine if they're the same or not.

Rewbert commented 2 years ago

I'm gonna try to inspect the procedure body first, as that seems like the least amount of work. Adding source location information to procedures is also nice but it would become more code to maintain. If we do write a plugin do insert that information, that plugin could potentially be outdated every now and then, as GHC changes. Ideally, we would do this with just software.

Rewbert commented 2 years ago

I wrote a prototype implementation where procedure bodies are inspected. The synthesized names are a bit iffy now but that's easy to change. I piggybacked on the name-generating machinery I already had in place.

fun1 :: Ref Int32 -> SSM ()
fun1 = box "fun1" ["x"] $ \x -> do
    fork [fun x]
  where
    fun :: Ref Int32 -> SSM ()
    fun = box "fun" ["x"] $ \x -> do
        x <~ (0 :: Exp Int32)

fun2 :: Ref Int32 -> SSM ()
fun2 = box "fun2" ["x"] $ \x -> do
    fork [fun x]
  where
    fun :: Ref Int32 -> SSM ()
    fun = box "fun" ["x"] $ \x -> do
        x <~ (1 :: Exp Int32)

testprogram :: SSM ()
testprogram = boxNullary "testprogram" $ do
    int32 <- var (5 :: Exp Int32)
    fork [fun1 int32, fun2 int32]

produces

entrypoint:
  testprogram()

global variables:

fun1Refi32(int* x) {
  fork [ funRefi32(x)
       ]
}

fun2Refi32(int* x) {
  fork [ funRefi32fresh1(x)
       ]
}

funRefi32(int* x) {
  x = 0
}

funRefi32fresh1(int* x) {
  x = 1
}

testprogram() {
  int *fresh0 = var 5
  fork [ fun1Refi32(fresh0)
       , fun2Refi32(fresh0)
       ]
}

Which seems correct.

Rewbert commented 2 years ago

If there are arguments provided at the host-language level, this is also recognized and procedures are specialized. E.g in the following program the first argument n exists in the host language, as does the if-then-else expression. Depending on the value ofn`, different procedure bodies are produced. This is now recognized.

fun3
    :: forall a
     . (Num a, Ord a, SSMType a, DefSSMExp a, FromLiteral a)
    => a
    -> Ref a
    -> SSM ()
fun3 n = box "fun3" ["r"] $ \r -> do
    if n < 2 then r <~ defaultValue @a else r <~ (defaultValue @a + 1)

testprogram2 :: SSM ()
testprogram2 = boxNullary "testprogram2" $ do
    r <- var (5 :: Exp Int32)
    fork [fun3 1 r, fun3 2 r, fun3 3 r]

becomes

entrypoint:
  testprogram2()

global variables:

fun3Refi32(int* r) {
  r = 0
}

fun3Refi32fresh1(int* r) {
  r = (0 + 1)
}

testprogram2() {
  int *fresh0 = var 5
  fork [ fun3Refi32(fresh0)
       , fun3Refi32fresh1(fresh0)
       , fun3Refi32fresh1(fresh0)
       ]
}

Of course, in this case, it makes more sense to embed the parameter in the embedded language and leverage the conditional execution mechanisms in the target language to produce just one procedure. If you don't care about code size I guess this code would save you a few cycles though!

Rewbert commented 2 years ago

The following encodes a supervisor process. The intention is that the supervisor allocates some resources, hands them to the consumers and forks them, and then returns (which deallocates the resource).

-- | supervisor to allocate resource and apply 'consumers'
supervisor :: forall a . DefSSMExp a => [(Ref a -> SSM ())] -> SSM ()
supervisor procs = boxNullary "supervisor" $ do
    r <- var $ defaultValue @a
    fork $ map ($ r) procs

-- 3 different consumer processes, that uses the resource
client1 :: Ref Int32 -> SSM ()
client1 = box "client1" ["r"] $ \r -> do
    r <~ int32 5

client2 :: Ref Int32 -> SSM ()
client2 = box "client2" ["r"] $ \r -> do
    r <~ int32 10

client3 :: Ref Bool -> SSM ()
client3 = box "client3" ["r"] $ \r -> do
    r <~ true'

testprogram3 :: SSM ()
testprogram3 = boxNullary "testprogram3" $ do
    fork [supervisor [client1, client2], supervisor [client3]]

Generated code:

entrypoint:
  testprogram3()

global variables:

client1Refi32(int* r) {
  r = 5
}

client2Refi32(int* r) {
  r = 10
}

client3Refbool(bool* r) {
  r = True
}

supervisor() {
  int *fresh0 = var 0
  fork [ client1Refi32(fresh0)
       , client2Refi32(fresh0)
       ]
}

supervisorfresh0() {
  bool *fresh0 = var False
  fork [ client3Refbool(fresh0)
       ]
}

testprogram3() {
  fork [ supervisor()
       , supervisorfresh0()
       ]
}

Rewbert commented 2 years ago

Writing a polymorphic function that alternates the value of a reference between two predetermined values at a set interval can look like this:

alternate :: Ref a -> Exp a -> Exp a -> Exp Word64 -> SSM ()
alternate r e1 e2 d = fork [ alternateProcess r e1 e2 d ]
  where
      alternateProcess :: Ref a -> Exp a -> Exp a -> Exp Word64 -> SSM ()
      alternateProcess = box "alternateProcess" ["r","e1","e2","d"] $ \r e1 e2 d -> do
          while' true' $ do
              r <~ e1
              delay d
              r <~ e2
              delay d

delay :: Exp Word64 -> SSM ()
delay t = fork [delayProcedure t]
  where
    delayProcedure :: Exp Word64 -> SSM ()
    delayProcedure = box "delayProcedure" ["delay"] $ \delay -> do
        x <- var event'
        after delay x event'
        wait [x]

testprogram4 :: SSM ()
testprogram4 = boxNullary "testprogram4" $ do
    r <- var (1 :: Exp Int32)
    x <- var true'
    alternate r 2 3 50
    alternate x false' true' 50

And then becomes

entrypoint:
  testprogram4()

global variables:

alternateProcessRefboolboolboolu64(bool* r, bool e1, bool e2, uint64 d) {
  while(True) {
    r = e1
    fork [ delayProcedureu64(d)
         ]
    r = e2
    fork [ delayProcedureu64(d)
         ]
  }
}

alternateProcessRefi32i32i32u64(int* r, int e1, int e2, uint64 d) {
  while(True) {
    r = e1
    fork [ delayProcedureu64(d)
         ]
    r = e2
    fork [ delayProcedureu64(d)
         ]
  }
}

delayProcedureu64(uint64 delay) {
  event *fresh0 = var ()
  after delay then fresh0 = ()
  wait [fresh0]
}

testprogram4() {
  int *fresh0 = var 1
  bool *fresh1 = var True
  fork [ alternateProcessRefi32i32i32u64(fresh0, 2, 3, 50)
       ]
  fork [ alternateProcessRefboolboolboolu64(fresh1, False, True, 50)
       ]
}

So for all of these examples, we do get different, specialized procedures for different types. We can write nice polymorphic functions in the EDSL. I am certain that @koengit can find a counterexample that breaks this though, so I don't think this issue is complete yet lol.

ssm-lang / Scoria

Monomorphisation [do not merge yet] #76