gelisam / klister

an implementation of stuck macros
BSD 3-Clause "New" or "Revised" License
134 stars 11 forks source link

local-expand variants #260

Open gelisam opened 1 week ago

gelisam commented 1 week ago

I now realize that #117, #119, #216, #232, #228, #229, and this concrete use-case for local-expand all fit together into a master plan, a sequence of features which each build on the previous one:

  1. local-transformation macros
  2. global-transformation macros
  3. local-expand-to-functor
  4. local-expand-problem
  5. local-expand-monad

I will attempt to explain those features in the context of a running example, a DSL for specifying routes in an http server, in the style of servant:

(routes
  -- GET /tasks/1234
  -- get details about a specific task
  (:>
    "tasks"
    (capture [task-id TaskId])
    (GET Task))

  -- GET /tasks
  -- list all the tasks
  (:>
    "tasks
    (GET (List Task)))

  -- GET /tasks/completed
  -- list all the completed tasks
  (:>
    "tasks
    "completed"
    (GET (List Task)))

  -- GET /tasks/todo
  -- list all the tasks which are not yet completed
  (:>
    "tasks
    "todo"
    (GET (List Task)))

  -- POST /complete?task-id=1234
  -- mark a specific task as completed
  (:>
    "complete"
    (query-param [task-id TaskId])
    (POST Unit)))

local-transformation macros

:> is a local-transformation macro, it takes n path components and expands into n-1 applications of a binary operator:

(:>
  "tasks
  "todo"
  (GET (List Task)))
=>
(binary-:> 
  "tasks"
  (binary-:>
    "todo"
    (GET (List Task))))

It is a local transformation in the sense that it merely reorganizes its input sub-terms while leaving the sub-terms untouched and unobserved.

global-transformation macros

Let's look at a global transformation next. Suppose the implementation of the http server looks at each route from top to bottom, and that a common bug is that the /tasks/:task-id route is listed before the /tasks/completed route, and therefore when the server receives a request for /tasks/completed, it incorrectly attempts to parse the string "completed" as a task id. Then it would make sense to define a global-routes macro which looks at all the routes which looks at all the routes defined within its body, spots the conflicts, and reorders them to avoid the bug:

(global-routes
  (binary-:> "tasks"
    (binary-:> (capture [task-id TaskId])
      (GET Task)))
  (binary-:> "tasks"
    (binary-:> "completed"
      (GET (List Task)))))
=>
(routes
  (binary-:> "tasks"
    (binary-:> "completed"
      (GET (List Task))))
  (binary-:> "tasks"
    (binary-:> (capture [task-id TaskId])
      (GET Task))))

This is a global transformation because global-routes had to examine the details of its sub-terms in order to find the conflicts. In general, a global transformation typically examines its entire input, and thus expects that entire input to have a specific shape; here, global-routes expects each route to be constructed via nested binary-:> applications, not via the n-ary :>. That restricted shape is inconvenient.

local-expand-to-functor

The next step is to eliminate that inconvenience by allowing local-transformation macros to collaborate with the global-transformation macro. The idea is that custom-core-routes specifies that it expects binary-:>, not :>, which tells the expander to expand :> but not binary-:>. This way, the user can use the nice :> syntax while the custom-core-routes implementation can parse the simple binary-:> syntax.

(custom-core-routes
  (:>
    "tasks"
    (capture [task-id TaskId])
    (GET Task))
  (:>
    "tasks"
    "completed"
    (GET (List Task))))
=>
(custom-core-routes
  (binary-:> "tasks"
    (binary-:> (capture [task-id TaskId])
      (GET Task)))
  (binary-:> "tasks"
    (binary-:> "completed"
      (GET (List Task)))))
=>
(routes
  (binary-:> "tasks"
    (binary-:> "completed"
      (GET (List Task))))
  (binary-:> "tasks"
    (binary-:> (capture [task-id TaskId])
      (GET Task))))

One way to implement custom-core-routes is via local-expand, by having custom-core-routes traverse its input and call local-expand whenever it encounters :> or any other symbol it does not recognize. I propose a fancier solution: custom-core-routes should instead call local-expand-to-functor, a polymorphic Macro action which in this case returns a (MyCore Syntax). One of the constructors of MyCore is my-binary-:>. Inside the body of custom-core-routes, the local-transformation macros are given the option of returning a (MyCore Syntax) value instead of returning a Syntax. For example, the binary-:> macro uses the my-binary-:> constructor.

The reason the type is (MyCore Syntax) and not just MyCore is two-fold:

  1. it allows the expander to maintain hygiene by flipping the scopes on those Syntax values, and
  2. it is a convenient type for the common case in which custom-core-routes recursively calls local-expand-to-functor on the Syntax values in order to obtain a fully-expanded (Fix MyCore).

The local-transformation macro should use (which-problem) to check if a (MyCore Syntax) value is expected, because the only place where the expander knows what to do with that value is in the outermost macro of the Syntax expanded by local-expand-to-functor. The (MyCore Syntax) is constructed in the same phase as the local-expand-to-functor call site which matches on that value, so this is effectively a dynamically-typed call with extra steps.

At that call site, custom-core-routes matches on a finite number of core cases (the MyCore constructors), and gives them meaning by transforming them into e.g. the Syntax for a function which receives a request and the implementation of an http handler, and chooses whether to call that handler or move on to the next route.

local-expand-problem

In some circumstances, we don't want a closed set of core cases, we want an open set to which other libraries can contribute. For example, capture and query-param are only two of many ways one might parse part of an http request for routing purposes, and it would be nice to be able to add more ways to parse routes without having to modify the library which implements the global-transformation macro.

If this was Haskell, then instead of a MyCore datatype, we would define a newtype wrapper around a function type (-> Bool Identifier Identifier (-> Bool Identifier Identifier Syntax) Syntax Syntax), and we would establish a convention explaining what such a function must do:

Since this is Klister, not Haskell, and we are not merely calling a function, we are making an indirect call via the expander, we must use a slightly different approach. Instead of a function from an input type to an output type, the global-transformation macro's module should define a custom Problem constructor (which must thus be an open type) wrapping the input type and specifying the output type (which is Syntax in this case, but could be something like (MyCore Syntax) as well).

Then, instead of hardcoding capture and query-param as the only two valid possibilities, the library would export those as two example macros which solve that Problem, and document how other libraries can do the same. custom-problem-routes would construct the appropriate continuations, wrap them in that Problem constructor, and pass it (and the syntax which might contain (capture [task-id TaskId]) or a library-provided alternative) to local-expand-problem. It would receive the output, in this case a Syntax, and use it to construct its own output, perhaps a function which takes an HList of handlers and an http request and calls the right one.

local-expand-monad

In Haskell, it is also very common for this kind of newtype to wrap a function which returns a monadic action. For example, if we implement a custom type system which uses its own MyType instead of Klister's Type, then the local-transformation macros might want to unify two MyType values or create new unification variables. Or, in our running example, maybe the library wants to provide a monadic API for consuming the next path component, so that under the hood it generates the code which examines the right part at runtime, without having to burden the implementation of the alternative-to-capture with the details.

It would thus make sense to have an expand-to-monad variant whose Problem specifies the input type, the output type, and the monad (which must support liftMacro) which the local-transformation macro is allowed to use.