gilch / hissp

It's Python with a Lissp.
https://gitter.im/hissp-lang/community
Apache License 2.0
369 stars 9 forks source link

Rethink `if-else` #198

Closed gilch closed 1 year ago

gilch commented 1 year ago

With the exception of [# (and prelude, sort of), the bundled macros eschew Python injections. These are too opaque to play nicely with tree-walking macros.

if-else is currently implemented like this:

(defmacro if-else (test consequent alternate)
  `((lambda (,'test : :* ,'then-else)
      ((operator..getitem ,'then-else (operator..not_ ,'test))))
    ,test
    (lambda : ,consequent)
    (lambda : ,alternate)))

Its expansion is rather large for such a common operation. (Many other macros contain if-else in their expansions as well.)

# ifQz_else
(lambda test,*thenQz_else:
  __import__('operator').getitem(
    thenQz_else,
    __import__('operator').not_(
      test))())(
  (1),
  (lambda :(2)),
  (lambda :(3)))

Three lambdas and six calls (including two imports). Had this been a special form, those would be zeros. There's no real need to alter the compiler for that, however, as stated in the old FAQ, a string-injection macro can do it:

(defmacro if-else (condition consequent alternate)
  "Compiles to if/else ternary conditional expression."
  (.format #"(({})if({})\n else({}))"
           : :* (map hissp.compiler..readerless
                     `(,consequent ,condition ,alternate))))
#> (if-else :a :b :c)
>>> # ifQz_else
... ((':b')if(':a')
...  else(':c'))
':b'

No more overhead than Python, more concise (though probably still not particularly readable), and also completely opaque to tree-walkers. They would all have to special-case anything like this, if they can handle it at all, and since their subexpressions are also text, they can't walk subtrees either, short of parsing Python. This is really not acceptable short of a manual micro-optimization. It can't be the default.

Hebigo has a different approach.

# hebi.basic.._macro_.if_
__import__('hebi.bootstrap',fromlist='?')._if_(
  (1),
  (lambda :(2)),
  else_=(lambda :(3)))

It uses a helper function, which it just imports. (Hebigo does use injections a lot, so it might have trouble with code-walking macros in general, but that doesn't seem like an issue here.) This also requires the bootstrap to be installed. The bundled macros have a rule against that, but the implementation could be simplified enough to inline, if it were injected. The subexpressions don't get stringified until compilation and thus remain accessible to code walkers, so (in that regard) it is no more opaque than using standard-library functions, which is not worth the cost of avoiding.

Note that only the body need be injected, although the entire lambda might also work. (And would be no worse than using an imported helper function. Implementation is the clearest name.)

(defmacro if-else (condition consequent alternate)
  `((lambda ,'bca .#"c()if b else a()")
    ,test
    (lambda : ,consequent)
    (lambda : ,alternate)))
#> (if-else :a :b :c)
>>> # ifQz_else
... (lambda b,c,a:c()if b else a())(
...   ':a',
...   (lambda :':b'),
...   (lambda :':c'))
':b'

Three lambdas (still), two calls, and one inject.

Is there some weird lambda calculus reason I'm not thinking of not to do this? I feel like just using standard-library calls is no worse. If this is allowed, what else would be simplifiable in this way?

gilch commented 1 year ago

I remember thinking I might need weird lambda calculus tricks to implement a yield macro. I thought about continuation-passing style and call/cc. Still not sure if that's doable. However, Ensue might make that unnecessary.

The full injection is not quite as bad as I made it sound. If a tree-walker could recognize the if-else, it could avoid expanding it before processing its subexpressions, which is basically how it would work as a special form. Each special form has to be special cased in the walker. That's why having only two of them makes writing code walkers a lot easier. All other (non-string injecting) macros could be macroexpanded to make them expressible in terms of these.

I feel like string-injection macros might be a common problem that walkers can't be expected to anticipate. They might kind of work like the "compiler macros" of Common Lisp (which are not the same thing as "macros"). Maybe if there were a standard way to mark these (an expand-after macro?) or a way to choose between an injected or non-injected implementation (an *inject* var?) then this could be made to work better.

gilch commented 1 year ago

I'm tempted to just do this. The inject-body version, I mean. But it's going to break a lot of examples in the docs, which would be a pain. I feel like I should finish the current doc project before I begin another one. Unlike a full injection, which would probably require a redo of all affected examples, I could probably just do a find and replace. Maybe it won't be that bad.

gilch commented 1 year ago

Looks like not that bad. Undoing it would be harder, but maybe still possible with a regex find-and-replace.

gilch commented 1 year ago

Looks like I can do @ lists and # sets just as easily.

There are a number of other macros that could benefit from this treatment, but they're more complicated. The best way to do % dicts is probably to create a dict literal in the body with string manipulation. The unpacking makes it tricky, but it's doable. && and || would be trivial if they were strictly binary. Making them multiary recursively like they are now wouldn't realize as much benefit as creating an inline helper function of the proper arity. Same with cond, which would just be a multiary if-else. when and unless would get a little bit simpler, since the nil case would not have to be passed in.

I feel like I implemented an injected throw before I found the walk_tb. Deserves another look if I can find it.

gilch commented 1 year ago

It was in the wiki here https://github.com/gilch/hissp/wiki/Bundled-macro-candidates#raise

.#"(x for x in'')", to make the generator, but it still needs to be closed. Could wrap in a doto. Longer injectables could close it too.

(g:=(x for x in'')).close()or g  # Leaks g.
(lambda g=(x for x in''):g.close()or g)()  # Doesn't, but longer.

Leaking is really not OK even with a gensym. (Although Hy did it a lot, last I checked.) The alternative is pretty much a doto, so I may as well doto.

gilch commented 1 year ago

doto doesn't auto-wrap like ->. Must have missed that. Fixed in feature branch. ensue is implemented in terms of ->, so should be fine. Were there any others?

gilch commented 1 year ago

when and unless were fairly easy fixes, but they break a lot of examples and can't be fixed with a simple regex find-and-replace, since their arity changed. I just have to go through all the examples.

%, &&, ||, and cond will need to compute a helper string to inject because it will depend on arity. These are going to be the most difficult. Their examples are going to change too, and not in a way I can fix with a regex. I'll probably have to redo all of them.

gilch commented 1 year ago

when/unless are done. The rest will have to wait for tomorrow, at least.

gilch commented 1 year ago

&& and || are done. Not as easy, but not really harder than writing a normal macro. Implementing one made the other pretty obvious.

cond may be slightly more difficult. % will probably be the hardest. The unpacking makes it tricky.

gilch commented 1 year ago

cond was harder, but doable given the && and ||. The old recursive macro definition in terms of if-else was really quite a bit simpler.

An optimization for these branching macros occurred to me. Any literal expressions need not be thunkified to delay evaluation. That's only required for evaluable forms (tuples and strings, and not the strings containing literals). However, this means that the body must avoid calling them like thunks as well, but still call anything thunkified. Matching these up would complicate the macros, although I suppose it's doable with enough code. if-else, when, and unless might be better defined in terms of cond in that case. It's not clear if && and || should be.

That just leaves %, which seems at least as difficult as cond was.

Given this technique, I'm wondering if there are any additional Python expressions I would want macros for. Generator expressions/comprehensions perhaps, but I don't see this working any better than the itertools and builtins. I'll have to look over that wiki page.

gilch commented 1 year ago

% is done. It was about as hard as cond.

I can't say that I should have done them this way in the first place. The old non-injecting implementations were worth writing for dogfooding purposes. The dynamic string-manipulation ones are fairly complex. Their expansions are a lot less noisy, but their implementations are a lot more difficult to understand. These would have been hard to write without the macro suite in place. I'm not sure this was an improvement. This feels like bloat. I'm tempted to just make them binary (or binary-recursive), but that wouldn't work for %.

gilch commented 1 year ago

Because it's a bit cryptic, I also wanted to mention somewhere that the X+1&-2 in cond is so that an odd number of arguments will crash (it ensures the helper always has an even number of parameters). cond is defined pretty early and I didn't have throw available yet.