Language with Multiple Syntactic (Semantic?) Categories

Currently, EbU (and the original unembedding) assumes a single syntactic category. However, actual programming languages have multiple ones, such as patterns, statements, and (syntactic) values. In such situations, we often want to have different semantic domains for different syntactic categories, and semantic functions must refer to multiple domains instead of a single one. (A notable exception is patterns because we do not want to make the typing environments implicit in patterns.)

A workaround (see below) exists, but doing so for each DSL is tedious. So, I prefer to have a general solution addressed at the library level, but I want to see more DSLs of this sort to design a good interface effectively.

Workaround (requires no changes to EbU)

Consider an imperative language, which has two syntactic categories, namely expressions and statements. Since the current EbU assumed a single sorted language, a naive remedy is to prepare a unified type as below, where GType represents a kind of guest types.

data UType = ExpType GType |  StatementType GType

However, in the current EbU, the semantic domain sem is assumed to have kind [k] -> k -> Type, meaning that the context must also be [UType], while the semantic domains for the syntactic categories are naturally SemE, SemS :: [GType] -> GType -> Type. For this particular example, we just can ignore tags in UType, but in general, different syntactic categories may even be indexed by different sorts of types. (For example, a semantic domain for statements may also be indexed by their effects.) To reflect the situation, we change the definition of UType as below

data UType = ExpType GType |  StatementType GTypeS 
SemE :: [GType] -> GType -> Type 
SemS :: [GType] -> GTypeS -> Type

while keeping the context type.

Unified Semantic Domain

An observation is that every binder must bound variables of a type of kindGType, suggested by the semantic domain types SemExp and SemStatement. Thus, an idea is to pass around the witness that the types in UType in the context must come from types in GType.

data ComesFromGType uenv env where 
  NilOk :: ComesFromGType '[] '[] 
  ConsOk :: ComesFromGType uenv env -> ComesFromGType (ExpType t : uenv) (t : env) 

data USemImpl (env :: [GType]) (u :: UType) where 
  FromE :: SemE env t -> USemImpl env (ExpType t) 
  FromS :: SemS env t -> USemImpl env (StatementType t)

unFromE :: USemImpl env (ExpType t) -> SemE env t 
unFromE (FromE e) = e 

unFromS :: USemImpl env (StatementType t) -> SemS env t 
unFromS (FromS s) = s 

newtype USem uenv u = USem { runUSem :: forall env. ComesFromGType uenv env -> USemImpl env u }

We can lift the original semantic domains into the unified domain, when wrapped with EnvI

liftE :: EnvI SemE t -> EnvI USem (ExpType t)
liftE (EnvI e) = EnvI $ \tenv -> USem $ \wit -> e (convTEnv tenv wit)
  where 
    convTEnv :: TEnv us -> ComesFromGType as us -> TEnv as 
    ...

unliftE :: EnvI USem (ExpType t) -> EnvI SemE t 
unliftE (EnvI e) = EnvI $ \tenv -> genTEnv $ \wit tenv' -> 
  let FromE e' = runUSem (e tenv') wit in e' 
  where 
    genTEnv :: TEnv as -> (forall us. ComesFromGType us as -> TEnv us -> r) -> r 
    ...

-- We are not able to use EnvI for statements due to its kind EnvI :: ([k] -> k -> Type) -> k -> Type, 
-- but we can do similarly.

Unified Semantic Functions

For first-order language constructs, we just pass around ComesFromGType.

-- lifting of (+)
addU :: EnvI USem (ExpType Int) -> EnvI USem (ExpType Int) -> EnvI USem (ExpType Int)
addU = UE.liftFO2 addSemU
  where 
    addSemU :: USem uenv (ExpType Int) -> USem uenv (ExpType Int) -> USem uenv (ExpType Int) 
    addSemU (USem u1) (USem u2) = USem $ \wit -> FromE $ addSem (unFromE $ u1 wit) (unFromE $ u2 wit)

    -- addSem :: SemE env Int -> SemE env Int -> SemE env Int

~~Manipulation of witness is required for second-order constructs, but this must be doable by leveraging the fact that it must bound variables of typeExpType t.~~

letU :: EnvI USem a -> (EnvI USem a -> EnvI USem b) -> EnvI USem b letU = UE.liftSOn (ol0 :. ol1 :. End) letSemU where letSemU :: USem uenv (ExpType a') -> USem (ExpType a' : uenv) b -> USem uenv b letSemU (USem u1) (USem u2) = USem $ \wit -> FromE $ letSem (unFromE (u1 wit)) (unFromE $ u2 (ConsOk wit)) -- letSem :: SemE env a -> SemE (a : env) b -> SemE env b

[Editted, Oct 28] The above is not true as we will not obtain the constraint a ~ ExpType a' to use letSemU. So, we either need to change the type of letU to be:

letU :: EnvI USem (ExpType a) -> (EnvI USem (ExpType a) -> EnvI USem (ExpType b)) -> EnvI USem (ExpType b)

Or, use SemE/SemD externally and convert them to USem and back when we implement the semantic function via lifting. This would be necessary if we keep the surface HOAS syntax as:

class Exp e where 
  let_ :: e a -> (e a -> e r) -> e r
  ...

class Exp e => Stm s e | s -> e where   
  bind :: e a -> (e a -> s r) -> s r 
  ...

kztk-m / EbU