hylang / hy

A dialect of Lisp that's embedded in Python
http://hylang.org

Add :while :if and :let modifiers to genexpr, comprehensions, and for #1371

Closed gilch closed 4 years ago

gilch commented 6 years ago

Since we need to compile comprehensions containing statements to generator functions anyway (#588), this gives us the opportunity to implement Clojure's optional :let and :while loop modifiers used in its list comprehension.

Clojure also has a :when modifier for a filter predicate, which is equivalent to the if keyword in Python's comprehensions. So I propose we just call it :if in Hy. Hy currently can't use more than one filter predicate in comprehensions like Python can (#848). This is a glaring deficiency for nested loops.

The modifiers follow the loop setup in the comprehension:

(genexpr <expression>
         [<binding1> <sequence1> <modifiers for loop 1>
          <binding2> <sequence2> <modifiers for loop 2>
           ...
          <bindingN> <sequenceN> <modifiers for loop N>])

compiles to

def _hy_anon_fn_1():
    # pull out <sequence1> statements here
    _hy_anon_var_1 = <sequence1>
    for <binding1> in _hy_anon_var_1:
        <modifiers for loop 1>
        # pull out <sequence2> statements here
        _hy_anon_var_2 = <sequence2>
        for <binding2> in _hy_anon_var_2:
            <modifiers for loop 2>
             ...
            # pull out <sequenceN> statements here
            _hy_anon_var_N = <sequenceN>
            for <bindingN> in _hy_anon_var_N:
                <modifiers for loop N>
                # pull out <expression> statements here
                yield <expression>
_hy_anon_fn_1()

In that structure, an :if <condition> loop modifier compiles to

# pull out condition statements here
if not <condition>:
    continue

And a :while <condition> loop modifier compiles to

# pull out condition statements here
if not <condition>:
    break

And a :let [<bindings>] loop modifier compiles to = assignments for each binding.

If there aren't any statements, then the :if modifiers will go directly in the comprehension, but the other modifiers will cause it to compile as a generator function.
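
As a rough illustration, here is what a hypothetical form such as (genexpr (* x y) [x (range 10) :let [y (* x x)] :if (odd? x) :while (< y 50)]) might compile to under this proposal. The Hy syntax is only the proposed one, and the generated names are assumptions that follow the pattern above.

```python
# Sketch of the proposed expansion; not actual Hy compiler output.
def _hy_anon_fn_1():
    _hy_anon_var_1 = range(10)
    for x in _hy_anon_var_1:
        # :let [y (* x x)] becomes a plain assignment
        y = x * x
        # :if (odd? x) skips this iteration when the test fails
        if not (x % 2 == 1):
            continue
        # :while (< y 50) ends the whole loop when the test fails
        if not (y < 50):
            break
        yield x * y


result = list(_hy_anon_fn_1())
print(result)
```

The modifiers run in the order they are written, so putting :while before :if could end the loop on an iteration that :if would otherwise have skipped.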

We might also want to move the <expr> after the bindings list, like (genexpr [<bindings>] <body>), which is the ordering Clojure uses. This way we can have an implicit do in the form.

Kodiologist commented 6 years ago

I'm surprised you're not advocating a more parenthesized (or square-bracketed) syntax, considering your concern for indentation. I know that, in Common Lisp, one of the reasons that iter was created to replace loop was that loop's syntax wasn't deemed Lispy enough. loop is definitely less Lispy than this proposal, though: it has bits that look like infix operators, among other things. At any rate, a keyword-based design seems fine to me.

I also agree with moving the body to the end. That bit of right-to-left-ness in Python's comprehension syntax always seemed odd to me, and looks even odder in Lisp.

gilch commented 6 years ago

> I'm surprised you're not advocating a more parenthesized (or square-bracketed) syntax, considering your concern for indentation.

Then you haven't been paying attention. 😉 Maybe most of this happened before we brought you on, but I was the one who advocated removing redundant brackets to make Hy more Clojure-like #853. I'm also the one who wrote Hy's version of Arc Lisp's if #830/#962 when we already had a Common Lisp-like cond. The new if is more like both Clojure's no-bracket cond and Python's if/elif/else statement.

My design priorities for Hy have generally been

  • When in doubt, defer to Python.
  • If you're still unsure, defer to Clojure.
  • If you're even more unsure, defer to Common Lisp.
  • Keep in mind we're not Clojure. We're not Common Lisp. We're Homoiconic Python, with extra bits that make sense.

--Hy style guide, explaining Hy's design

We can't do everything exactly like Python, or why not just use Python? On the other hand, we're not trying to re-implement Clojure on the Python VM--we're more tightly integrated with Python than Clojure is with Java. Hy is, first and foremost, Homoiconic Python. We're still Python (TM)! And we also need the Lisp bits to make it work. Python's "batteries included" philosophy applies to Hy too. We need to fill the Lisp gaps in the standard library. And I'm mostly filling these in with ideas from Clojure.

Clojure was designed with the benefit of hindsight (though some still like Common Lisp's style better), so where Clojure and Common Lisp disagree, I pick Clojure's version. But when Clojure doesn't have a good fit for Python, I look at Common Lisp before I look at other Lisps.

In this case, both Python and Clojure use keyword syntax in their list comprehensions. I don't really see a conflict on this point, so it seems like a good fit for Hy. There's no need to defer to Common Lisp this time.

Lisp is read (by humans) by indentation. Once you get used to it, it's as easy to read as Python. But if you don't follow the basic indentation rules, this doesn't work and the alternative is counting brackets, which humans can't do quickly or reliably, especially when you put them all on one line at the end, like we do in Lisp. Proper indentation is, first and foremost, about legibility.

But Lisp is edited structurally, with tools like ParEdit, and that's why some prefer Common Lisp's style over Clojure's--because there's more structure to work with. But the more minimal Clojure style is easier to remember, because there's less structure you have to produce. I think this makes it a better cultural fit for Pythonistas, who prefer Python's "fun and easy" approach. And I think it's not too hard to add that structure with ParEdit when you need it and then remove it again when you're done.

Python users are used to defining blocks of code by their indentation. Parinfer infers where the brackets go based on indentation. It's also easier to learn than ParEdit, since you only need to memorize one keybinding (and can get away without even using that). This makes it a great cultural fit for the Python community to edit Hy. I'd strongly oppose any changes to the language or Style Guide that would be incompatible with Parinfer. So proper indentation is also about easy editing.

gilch commented 6 years ago

> I also agree with moving the body to the end. That bit of right-to-left-ness in Python's comprehension syntax always seemed odd to me, and looks even odder in Lisp.

I think Python got it from Haskell, which got it from set theory. I've heard that some like to use math notation in their editors. If we switched it that wouldn't look as mathy. But Lisp forms usually put the body at the end.

Python does it forwards when building lists in for loops, so it's not hard to understand forwards. I also think it's a better fit for Lisp with the implicit do, but on second thought, how should this work in a dict-comp? The key and value are easier to distinguish positionally without the implicit do. Otherwise we'd have to return a pair as the last form (like (key . value) or [key value]) or separate them with a keyword, (like key : value) which isn't consistent with our dict displays. I don't want to change only the other comprehensions.

So, to stay consistent, maybe we shouldn't add the implicit do. do is typically used for side effects, which is usually not something we want in generators. If you need a do, maybe you should write your own fn and yield generator. Without the implicit do, I'm less motivated to move the body to the end.

On the other hand, if we had a :top (:do? :with?) modifier for the bindings list, we could inject any code we like at the top of any loop, including break or continue or even yield. You could implement :if <condition> as :top (unless <condition> (continue)), and :while <condition> as :top (unless <condition> (break)) and :let [<bindings>] as :top (setv <bindings>). This setup seems pretty general, which is making me wonder if we need keywords at all, though Clojure's syntax is shorter for the common cases. And without keywords, it's longer in the common case when you don't need a modifier, since you'd need a placeholder. And if you need to be that general, why not write your own generator?

Yeah. I'm thinking we should just mimic Clojure's design instead of trying to come up with a new one.

I'd like to hear more opinions about which end to put the body in though. Clojure and Python disagree here, so I also looked at Common Lisp. But it looks like it doesn't have a standard list-comp. I suppose you'd just use maps and filters. If you try to mimic a comprehension with a Python map, you'd actually put the body first, in the lambda.
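
For reference, a minimal Python sketch of that map/filter style next to the equivalent comprehension. The squares-of-odd-numbers example is made up purely for illustration.

```python
# Comprehension: body first, then the loop, then the filter.
squares_comp = [x * x for x in range(10) if x % 2]

# map/filter: the body again comes first, inside the lambda given to map.
squares_map = list(map(lambda x: x * x, filter(lambda x: x % 2, range(10))))

print(squares_comp)
print(squares_map)
```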

Kodiologist commented 6 years ago

> Maybe most of this happened before we brought you on

That's right; I only joined core in November 2016.

gilch commented 6 years ago

I think that special forms should generally be kept as simple as possible (but no simpler); that is, they should work exactly like Python, plus be able to use "statements" as expressions. The extra features should be in a core macro built on top of that instead.

The problem is, the macro will have to share a lot of logic with the compiler. Code duplication is bad. One way to avoid this would be for the compiler to expand to the macro invocation when it has to compile statements into a comprehension. But without better macro namespacing, this seems brittle.

Kodiologist commented 6 years ago

I've thought about this a bit. There are lots of natural ways to extend the idea of comprehensions, like your :let proposal. But, trying to simulate comprehensions (which we'd need to in order to compile the extended versions) seems error-prone because of scoping issues. I propose instead that we have fairly restrictive special forms that always produce real comprehensions, and encourage the use of generators and nested loops for anything that can't be accomplished with a comprehension.

For parallelism and brevity, suppose the comprehension forms are called compl for list comprehensions, compd for dictionary comprehensions, comps for set comprehensions, and compg for generator expressions. The following

 (compl
   :for x (range 5)
   :for y (range 8)
   :if (!= x y)
   :for z [x y]
   (* x y z))

would compile to

 [x*y*z
     for x in range(5)
     for y in range(8)
     if x != y
     for z in [x, y]]

So only the keywords :for and :if (or maybe we should spell :if as :when, since there's no else clause) are supported. The generation value is the last form, except for compd, in which case it's the last two forms. It is an error if, during compilation, a statement is produced at any step; every part needs to compile to a pure expression.

The generator equivalent to the above is (without nested indents, to make the parallelism clearer):

(list ((fn []
  (for [x (range 5)]
  (for [y (range 8)]
  (when (!= x y)
  (for [z [x y]]
  (yield (* x y z)))))))))
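
The same correspondence can be checked in plain Python. This is only a sketch of the two shapes side by side; the name gen is arbitrary.

```python
# The comprehension that the compl example above would compile to.
comp = [x * y * z
        for x in range(5)
        for y in range(8)
        if x != y
        for z in [x, y]]


# The nested-loop generator, clause for clause in the same order.
def gen():
    for x in range(5):
        for y in range(8):
            if x != y:
                for z in [x, y]:
                    yield x * y * z


print(comp == list(gen()))
```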

Since defining and immediately calling a function is often useful, particularly now that we have return, it could make sense to add a macro such as (defmacro cfn [&rest body] `((fn [] ~@body))) to core.

gilch commented 6 years ago

> But, trying to simulate comprehensions (which we'd need to in order to compile the extended versions) seems error-prone because of scoping issues.

Concrete examples, please. These scoping issues don't exist, with the exception of list-comp in Python 2, which uses different rules from all the other comprehensions. Try it out.

In Python, spam = (foo(x) for x in bar()) is just syntactic sugar for

def anon():
    for x in bar():
        yield foo(x)
spam = anon()

They have exactly the same scoping rules. Python already compiles it to a generator function, so we can do the same without issue.
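
A quick way to check this in Python 3 is sketched below; foo and bar are stand-ins. The sketch also records one timing difference that is separate from scoping: a generator expression evaluates its outermost iterable immediately, while the hand-written generator function only calls bar() on the first next(). The expansion proposed at the top of this issue assigns the sequence inside the function, so it would follow the lazy behavior.

```python
calls = []


def bar():
    calls.append("bar")
    return [1, 2, 3]


def foo(x):
    return x * 10


def demo():
    spam_expr = (foo(x) for x in bar())  # bar() is called right here

    def anon():
        for x in bar():                  # bar() is called on the first next()
            yield foo(x)

    spam_fn = anon()
    calls_before_iteration = list(calls)
    # In both versions the loop variable x stays out of demo's own locals.
    x_leaked = "x" in locals()
    return list(spam_expr), list(spam_fn), x_leaked, calls_before_iteration


expr_items, fn_items, x_leaked, early_calls = demo()
print(expr_items, fn_items, x_leaked, early_calls)
```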

I'm not sure if we should bother implementing the Python2-scoped list-comp, but (in Python2) spam = [foo(x) for x in bar()] is just syntactic sugar for

anon = []
for x in bar():
    anon.append(foo(x))
spam = anon
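
Python 2 can't be assumed here, so the sketch below runs under Python 3 and shows the contrast from the other side: the comprehension keeps its variable private, while the plain loop expansion (the Python 2 list-comp behavior) leaves it in the enclosing scope. The variable names are arbitrary.

```python
def demo():
    # Python 3 list comprehension: v is not visible afterwards.
    spam = [v * 2 for v in range(3)]
    comp_leaks = "v" in locals()

    # Python 2-style expansion: w is an ordinary local of demo.
    anon = []
    for w in range(3):
        anon.append(w * 2)
    loop_leaks = "w" in locals()

    return spam, anon, comp_leaks, loop_leaks


spam, anon, comp_leaks, loop_leaks = demo()
print(spam, anon, comp_leaks, loop_leaks)
```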

> I propose instead that we have fairly restrictive special forms

There are good arguments for making special forms as close to native Python as possible. Designing Hy that way from the start might have made sense, but we're currently depending on the compiler to make statements act like expressions, and make binary operators variadic, e.g. (+ 1 2 3) to ((1 + 2) + 3) and (get foo bar baz) to foo[bar][baz]. We could probably re-implement the variadic operators as macros based on binary special forms.
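
To make the folding concrete, here is a runtime sketch in Python of the left fold over a binary operation. It only illustrates the shape of the expansion; the Hy compiler emits nested AST nodes rather than calling helpers like these, and variadic_add and variadic_get are made-up names.

```python
from functools import reduce
import operator


def variadic_add(*args):
    # (+ 1 2 3) folds left into ((1 + 2) + 3)
    return reduce(operator.add, args)


def variadic_get(obj, *keys):
    # (get foo bar baz) folds left into foo[bar][baz]
    return reduce(operator.getitem, keys, obj)


print(variadic_add(1, 2, 3))
print(variadic_get({"a": {"b": [10, 20]}}, "a", "b", 1))
```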

But I'm not sure if moving the statement-to-expression logic out of the compiler is such a good idea.

Kodiologist commented 6 years ago

So, as of Python 3, there is no semantic difference between comprehension syntax and an anonymous function (except for return, continue, and break, which aren't legal in Python comprehensions anyway)?

In that case, is there any reason to support comprehensions beyond the rule that it should be possible to produce any valid Python construct with Hy?

gilch commented 6 years ago

> no semantic difference between comprehension syntax and an anonymous function

Let's be clear about a function vs a generator function. Between a generator expression and an anonymous generator there's no semantic difference. It has to contain yield. You can see this is true if you add yield expressions to a generator, e.g.

>>> list([(yield x), (yield x+x), (yield x*x)] for x in range(4))
[0, 0, 0, [None, None, None], 1, 2, 1, [None, None, None], 2, 4, 4, [None, None, None], 3, 6, 9, [None, None, None]]

If you think about its "expansion", the output makes perfect sense.

def _anon():
    for x in range(4):
        yield [(yield x), (yield x+x), (yield x*x)]
list(_anon())

So, yeah, you could do the same thing with fn/for/yield in Hy.

But I'm pretty sure a list comprehension (in Python 3) like [(yield x) for x in range(3)] actually compiles to something like

def _anon1():
    _anon2 = []
    for x in range(3):
        _anon2.append((yield x))
    return _anon2

Rather than a generator in a list constructor, like you might expect. That's why this list comprehension returns a generator, not a list:

>>> [(yield x) for x in range(3)]
<generator object <listcomp> at 0x0000025BE19FD308>
>>> spam = _
>>> next(spam)
0
>>> next(spam)
1
>>> next(spam)
2
>>> next(spam)
Traceback (most recent call last):
  File "<pyshell#22>", line 1, in <module>
    next(spam)
StopIteration: [None, None, None]

Again, if you think of the "expansion", this behavior makes perfect sense. Note that the list return value came out in the StopIteration, just like when you use a return statement in a generator.
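
Note that yield inside a comprehension or generator expression was deprecated in Python 3.7 and is a SyntaxError from Python 3.8 on, so the transcripts above need an older interpreter. The explicit expansion still runs on current versions, and the sketch below uses it to show the returned list arriving in StopIteration.value. The drain helper is a made-up name for illustration.

```python
def _anon1():
    _anon2 = []
    for x in range(3):
        # next() sends None into the generator, so None is what gets appended
        _anon2.append((yield x))
    return _anon2


def drain(gen):
    """Collect every yielded value, plus the generator's return value."""
    yielded = []
    while True:
        try:
            yielded.append(next(gen))
        except StopIteration as stop:
            return yielded, stop.value


yielded, returned = drain(_anon1())
print(yielded, returned)
```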

> beyond the rule that it should be possible to produce any valid Python construct with Hy?

I wonder if they're better optimized in PyPy somehow. The hy2py output might look prettier. Yeah, I can't really think of another reason. We could implement generators as macros that expand to fn/for/yield with no loss of expressive power. Similarly, comprehensions could be wrapped in anonymous functions (except list-comp in Python 2, maybe), with no loss.

Kodiologist commented 6 years ago

Aw geez, I didn't realize that stuff like [(yield x) for x in range(3)] is legal Python. I know [x for x in range(3)] is a list comprehension and (x for x in range(3)) is a generator expression, but the form with yield is a strange hybrid with no clear use case.

Anyway, given that scoping in a comprehension works just like function scoping, I guess we can use a special form that automagically produces a function instead of a comprehension when necessary. If we keep the :for and :if/:when keywords, we can let the user insert arbitrary forms in the middle, in which case they're inserted into the corresponding parts of the nested loop. For example,

(compl
  :for x (range 10)
  (setv y (* 2 x))
  :for z (range y)
  (foobar)
  (+ x y z))

becomes

(list ((fn []
  (for [x (range 10)]
    (setv y (* 2 x))
    (for [z (range y)]
      (foobar)
      (yield (+ x y z)))))))
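
In Python terms, that expansion would look roughly like the sketch below. foobar is a placeholder for whatever side-effecting form the user inserts; here it just counts its calls.

```python
calls = []


def foobar():
    # Placeholder side effect: record that the form ran.
    calls.append(1)


def _anon():
    for x in range(10):
        y = 2 * x              # the form inserted after the first :for
        for z in range(y):
            foobar()           # the form inserted after the second :for
            yield x + y + z


result = list(_anon())
print(len(result), result[:2], result[-1], len(calls))
```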

gilch commented 6 years ago

Clojure's doseq is the equivalent of Hy's for (which is why I proposed renaming it to doiter #1125), and doseq also has the modifiers :while :when and :let, just like Clojure's comprehension.

I'd like to add them to Hy's for, and for consistency, Hy's comprehensions (and genexpr) should have the same modifier syntax.

compl and friends don't seem compatible with an expanded for that way.

Kodiologist commented 6 years ago

Wouldn't it be better to provide compl etc. as fairly literal equivalents of Python comprehensions and doseq as a macro using compl, rather than the reverse? We want the special forms to be equivalents of Python constructs and fancier stuff to be moved to macros, right?

gilch commented 6 years ago

What? Hy's for is already a macro, not a special form. It expands to nested for*. And Clojure's doseq is for side effects. It always returns nil, just like Hy's for. Why should Hy's for build a list that you're just going to throw away by expanding into a list comprehension instead?

Kodiologist commented 6 years ago

It should not. I'm sorry if I seemed to suggest otherwise.

Kodiologist commented 4 years ago

I think I forgot to close this after #1626 was merged, but reopen it if there are still missing features that you want.