ox-lang / ox

Ox - An immutable statically typed Lisp for the JVM
https://ox-lang.org
66 stars 3 forks source link

Bootstrap code loader #29

Closed arrdem closed 7 years ago

arrdem commented 7 years ago

This PR is gonna take a while, just because it being "done" basically means that I've nailed the design for the analyzer & analysis state. This also requires having figured out the internal representation for types, macros, functions and a whole lot of fun things.

I think that the way this is gonna work out is that Ox will have to be a two-pass compiler system. From a simple data availability & phasing perspective, it's not possible to support macros in an load-order insensitive language, and load-order sensitivity is one of the things I was hoping to kill in this project by doing whole-program compiles.

The only way I see to unify a language which features global type inference with a language featuring macros is to first provide a load-order sensitive interpretive phase, during which time types are dynamically checked (but still present). This makes macro expansion possible in the interpretive phase. Once all code has been loaded and all macros have been expanded, then it is possible to do static type checking, code generation & optimization from the interpretive environment state.

The implication for the design of Ox's internal state is that it must provide abstract rather than concrete bindings, as for the most part concrete bindings won't be established yet.

Typing Haskell In Haskell is deeply influential in this design.

arrdem commented 7 years ago

Twitter thread of rubber-ducking with @lojikil https://twitter.com/arrdem/status/871098267727151104

TL;DR the way to move forwards here seems to be by using phasing and delayed evaluation, combined with imposing the restriction that all top level forms establish bindings. If we restrict the grammar of top level forms to

(ns foo
  ; require is restricted to as, and bare requirements.
  ; refer and rename could be supported, but qualified imports are largely better.
  (require (as bar b))
  (require (referring bar [+]))
  (require (renaming bar {div /})))

; scheme style syntax-rules & define-syntax are nice.
; clojure-style macro functions would also work.
; The essential property is that a module level binding is established either way.
(define-syntax <name> body)

; defs are used to name types, functions & arbitrary expressions.
; Again, a module level binding is established this way.
(def <name> &optional docstring body)

; providing allows macro uses to occur at the top level.
; Note that at present, macros cannot occur at the top level because it's not possible
; to statically determine what symbol(s) are provided by a given form (un-computed
; macroexpansion). The providing form solves this problem by allowing "top-level"
; macros to emit definitions only in a context where the set of defs to be emitted is
; explicitly associated with the form which will emit them.
(providing [& names] & body)

This design allows a trivial lex-time static analyzer to compute the full set of defs in the program WITHOUT having to actually perform macroexpansion. After the initial lex & analysis has been performed, then definitions can be lazily macroexpanded as required since the set of definitions is already fully determined. This design also has the advantage of fully supporting mutually recursive modules AND load-order independence.

lojikil commented 7 years ago

Love it (of course), and interested to see where you take it for sure (also, now watching this, lol dunno why I wasn't before).

arrdem commented 7 years ago

This isn't quite legal Haskell but it's pretty close to what I'm playing with once you allow for the PackageLoader class to be an interface instance serving to map PackageIdentifiers to resources and load them, returning a realized Package.

-- Ox.hs

data Form = Form
  { expr     :: LispVal
  , provides :: [Atom]
  } deriving (Show, Eq)

data Either3 a b c = A a
                   | B b
                   | C c

data FormId = FormId Integer

data Namespace = Namespace
  { name        :: Atom
  , forms       :: (Map FormId Form)
  , aliases     :: (Map Atom Atom) -- (require (as a b))
  , definitions :: (Map Atom (Either FormId
                                     Definition))
  } deriving (Show, Eq)

data Definiton = Alias Atom -- (require (referring a [b c]))
               | Data Any -- FIXME: Type definition
               | Function { documentation :: String
                          , macrop        :: Bool
                          , value         :: Any
                          }

data PackageIdentifier v = PackageIentifier Atom v

data Package v = Package
  { id           :: PackageIdentifier
  , namespaces   :: (Map Atom Namespace)
  , dependencies :: (Set PackageIdentifier)
  }

data VM = VM
  { loader   :: PackageLoader
  , packages :: (Map PackageIdentifier Package)
  }
vendethiel commented 7 years ago

AFAIK, "delay static type checking after macros are expanded" is hiw Nim does it, but I don't know how it works internally

arrdem commented 7 years ago

2017-06-05 01 03 48 2017-06-05 01 03 15

Today's notes... trying to hammer out how to phase the loading of packages given that packages need to be loaded namespace-at-a-time (files / readable resources are the loading unit) while packages and namespaces in packages may be mutually referencing/recursive and I'd like to choose an architecture which preserves a maximum of parallelism at least if possible.

Either packages need to, like namespaces, have explicit provides semantics to solve this problem or packages kinda need to go away and I just drop the whole multiple simultaneous versions goal of the project. Maybe there's a third road...

arrdem commented 7 years ago

SHIP AND ITERATE.