joelburget / lvca

language verification, construction, and analysis
https://lvca.dev
MIT License
20 stars 0 forks source link

Core language #6

Open joelburget opened 4 years ago

joelburget commented 4 years ago

I think it's important to have a core language that serves as a good default target.

My initial attempt looks like this:

type core =
  (* first four constructors correspond to regular term constructors *)
  | Operator of string * core_scope list
  | Var of string
  | Sequence of core list
  | Primitive of primitive
  (* plus, core-specific ctors *)
  | Lambda of sort list * core_scope
  | CoreApp of core * core list
  | Case of core * core_scope list
  (** A metavariable refers to a term captured directly from the left-hand-side
  *)
  | Metavar of string
  (** Meaning is very similar to a metavar in that it refers to a capture from
      the left-hand-side. However, a meaning variable is interpreted. *)
  | Meaning of string

This issue is here to braindump some thoughts and references.

The first question to answer is what exactly is this core language for? It's to unambiguously define the semantics of a language (via translation to core). It's nice if we can do other things like step it with a debugger, but that's secondary.

Two important concerns, fairly unique to this project, are inclusion of terms from other languages and computational primitives.

By "terms from other languages" I mean that denotational semantics (in general / in LVCA) is about translating from language A to B. When using core, this is specialized to a translation from A to Core(B), where Core(B) is core terms with terms from B embedded. As an example, a case expression in Core(bool):

case(
  true();
  [ true(). false(), false(). true() ]
)

Some of the syntax is up for debate, but the point is that this is the equivalent of (OCaml) match true with true -> false | false -> true, but where booleans are not built in to core at all, they're from the language embedded in core.

The other concern I mentioned above is computational primitives, by which I mean primitives that are expected to actually do something. For example, you might have primitive #not, in which case you could write something like the above example as #not(true()). Here #not is not built in to the specification of core, but it's provided by the runtime environment. (I'm using a hash to denote primitives, but it's just a convention I think is nice).

With primitives we're now dealing with "core plus", core extended with a set of primitives. So the example #not(true()) is a term in Core(bool()){#not}. The syntax is complete undecided, but the idea is that this term can be evaluated in any environment that provides the #not primitive. I think this is really cool. You could easily find the set of primitives your language relies on. It would even be possible to do a translation to a different set of primitives, eg Core(bool()){#not} -> Core(bool()){#nand}.

joelburget commented 4 years ago

Should functions be applied to multiple arguments simultaneously? GHC core only allows a single arg. Mitchell's thesis and GRIN allow multiple args.

Should let-bindings be added as terms? All three I just mentioned have this. By the way, here's a great answer about why Gallina (Coq core) needs let-binding. As far as I can tell, not a concern for us, since we don't plan for our core to be dependently typed.