gilch / hebigo

蛇語(HEH-bee-go): An indentation-based skin for Hissp.
https://github.com/gilch/hissp
Mozilla Public License 2.0

class macro and decorator syntax #7

Closed gilch closed 5 years ago

gilch commented 5 years ago

How should the class: macro look/work? I want Hebigo to look and feel familiar to Pythonistas. But at the same time, Hissp is targeting a functional subset of Python to reduce incidental complexity. It can't be simpler if it works exactly like Python.

gilch commented 5 years ago

Starting with a Python example like this:

class Foo(Bar):
    """docstring"""
    def __init__(self, x):
        self._x = x
    @property
    def x(self):
        return self._x

Option 1, explicit pairs

class: Foo: Bar
    __doc__:"""docstring"""
    __init__:lambda: :: self x
        setattr: self "_x" x
    x:property:lambda: :: self
        self._x

This looks a lot like the dict that it is. Lambdas don't get their name metadata, even though the left of the pair names it. Maybe we want an additional macro that expands to a pair?
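To make the "dict that it is" remark concrete, here is a rough Python sketch of what Option 1 might expand to, assuming the macro simply hands the pairs to `type()`. `Bar` is a stand-in base class, and the expansion shown is an assumption for illustration, not Hebigo's actual output.

```python
class Bar:
    """Stand-in base class for the example."""


# Hypothetical expansion of Option 1: the class body is just a dict of pairs.
Foo = type(
    "Foo",
    (Bar,),
    {
        "__doc__": "docstring",
        "__init__": lambda self, x: setattr(self, "_x", x),
        "x": property(lambda self: self._x),
    },
)

foo = Foo(42)
print(foo.x)                  # 42
print(Foo.__init__.__name__)  # <lambda>, not __init__
```

The last line shows the missing name metadata: the dict key names the attribute, but the function object itself still reports `<lambda>`.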

Option 2, rewrite def: in class body to the dict pairs.

class: Foo: Bar
    def: __doc__ """docstring"""
    def: __init__: self x
        setattr: self "_x" x
    def:
        x property:lambda: :: self
            self._x

Methods look very Python. Class attributes not so much, but it's consistent with the globals. Docstring could be special-cased, but not other class attributes. The method lambdas could get their name metadata. Decorators look pretty awkward. Because this is expanding to a dict, you can't access any attrs set in the body later in the body, so the decorator can't be after the method either.

Option 3, use := keyword to designate pairs. It looks like the new walrus operator, but isn't.

class: Foo: Bar
    __doc__ := """docstring"""
    __init__ := lambda: :: self x
        setattr: self "_x" x
    x := property:lambda: :: self
        self._x

This doesn't look much better than the explicit pairs, and we can't use helper macros that expand to pairs now. But anything not adjacent to a := can be executed for side effects. This doesn't seem that useful without a namespace. And we could have used a progn for side effects before.

gilch commented 5 years ago

Option 4, allow a mix of 2 and 3.

class: Foo: Bar
    __doc__ := """docstring"""
    def: __init__: self x
        setattr: self "_x" x
    :@ property
    def: x: self
        self._x

With special-cased docstring, and a new class attr.

class: Foo: Bar
    """docstring"""
    __slots__ := "_x"
    def: __init__: self x
        setattr: self "_x" x
    :@ property
    def: x: self
        self._x

Looks very Python. But it's over-complicating the macro maybe. Also, the keyword decorators make no sense at the top level, so we lose the symmetry there.

gilch commented 5 years ago

Option 5, control words :=, :def, :@.

class: Foo: Bar
    """docstring"""
    __slots__ := "_x"
    :def __init__: self x
        setattr: self "_x" x
    :@ property
    :def x: self
        self._x

It's kind of a DSL now. Again we lose the top-level symmetry, now with def as well as decorators.

gilch commented 5 years ago

Option 6, explicit pairs, but def expands to pair. Decorators go inside

class:
    :@ some_class_decorator
    :@ another_one
    Foo: Bar
    """docstring"""
    __slots__:"_x"
    def: __init__: self x
        setattr: self "_x" x
    def:
        :@ property
        x: self
        self._x

Decorators could go inside def: like this even at the top level, so we'd keep symmetry. The class/function/method names get a little harder to find. With syntax highlighting I think you'd get used to it. Editor searches for class/function/method are likewise more difficult. Regex could help a bit. Simple code folding based on indents would also probably hide the name.

gilch commented 5 years ago

~Option 7~, Decorators go inside the args tuple

class:
    Foo:
        :@ some_class_decorator
        :@ another_one
        Bar
    """docstring"""
    __slots__:"_x"
    def: __init__: self x
        setattr: self "_x" x
    def:
        x:
            :@ property
            self
        self._x

This looks worse.

Option 8, Name goes outside the args tuple when there are decorators.

class: Foo
    :@ some_class_decorator
    :@ another_one
    :: Bar
    """docstring"""
    __slots__:"_x"
    def: __init__: self x
        setattr: self "_x" x
    def: x
        :@ property
        :: self
        self._x

Code folding, searching, and visual inspection should find the names more easily this way, but I'd almost rather use lambdas at this point.

gilch commented 5 years ago

Option 9,

class: :@ some_class_decorator: another_one: Foo: Bar
    """docstring"""
    __slots__:"_x"
    def: __init__: self x
        setattr: self "_x" x
    def: :@ property: x: self
        self._x

This kind of makes more sense, but what if the decorators have arguments?

class: :@
    :: some_class_decorator: arg1 arg2
        another_one: Foo: Bar
    """docstring"""
    ...

Again it makes sense, but the name is getting lost again. I'd rather not even have decorator syntax at this point, and just call them after. But this doesn't work in a class body that expands to a simple dict. There's no namespace to get it back from. Option 6 seems less bad.

gilch commented 5 years ago

Other considerations:

Do we need classes at all? It's not the style I'd want to encourage for a functional Lisp. But unlike Lissp most of the time, Hebigo does need the operator magic methods.

We have type() and whatever metaclass callables you like already.

You could start with an empty class body and dynamically add attributes, like vanilla JavaScript. (Or Smalltalk.)
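A minimal Python sketch of that last idea, assuming nothing beyond the built-in `type()`: create an empty class, then attach attributes afterwards. The attribute names are only for illustration.

```python
# Build an empty class with type(), then add attributes one at a time.
Foo = type("Foo", (), {})

Foo.__init__ = lambda self, x: setattr(self, "_x", x)
Foo.x = property(lambda self: self._x)
# Operator magic methods also work when attached after creation,
# because Python looks them up on the class, not on the instance.
Foo.__add__ = lambda self, other: Foo(self._x + other._x)

total = Foo(1) + Foo(2)
print(total.x)  # 3
```

This works without any class syntax at all, at the cost of repeating the class name on every line.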

gilch commented 5 years ago

Option 10, decorators go in body before docstring

class: Foo: Bar
    :@ some_class_decorator
    :@ another_one
    """docstring"""
    __slots__:"_x"
    def: __init__: self x
        setattr: self "_x" x
    def: x: self
        :@ property
        self._x

This works at the top level, keeps easy visual/editor search, and allows folding without hiding the name. The :@ control word can be special cased as easily as the docstring is. It makes a lot of sense in a function def, which is analogous to a key/value def. You assign the "key" (like x: self), which resembles an invocation, to the body "value" like self._x. The decorators appear to be functions modifying the "value".

gilch commented 5 years ago

Going with Option 10's decorators.

The def macro currently creates a global binding, even when nested, like Clojure's defn (it would have to be rewritten inside class macros). This behavior is probably surprising for Pythonistas. Racket's behavior of creating a locally-scoped definition is more Pythonic. Unfortunately, it's also harder to implement. The globals() dict is write-through, but (in Python 3) the locals() dict is not. (Unless it happens to be at the top level anyway.)

Because we're limited to expressions, that means the only way left to update locals is in a comprehension, which seems pretty awkward. Introducing new locals, on the other hand, can be done with a lambda.
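The scoping points above can be checked in plain Python. This is only a sketch, and all function names here are made up for the demonstration: writes through `locals()` inside a function are lost, writes through `globals()` stick, a comprehension can rebind a name per iteration, and a lambda parameter introduces a fresh local.

```python
def try_locals_write():
    x = 1
    locals()["x"] = 2  # writes to a dict copy, not the real local variable
    return x


def try_globals_write():
    globals()["_demo_global"] = 2  # globals() is the real module dict
    return _demo_global


def sum_with_comprehension(nums):
    # Rebind `total` on each iteration using a single-element inner loop.
    # (Assumes nums is non-empty; an empty input leaves nothing to index.)
    return [total for total in [0] for n in nums for total in [total + n]][-1]


def new_local_with_lambda(x):
    # A lambda parameter acts as a new local, much like a `let`.
    return (lambda y: y * 2)(x + 1)


print(try_locals_write())                 # 1
print(try_globals_write())                # 2
print(sum_with_comprehension([1, 2, 3]))  # 6
print(new_local_with_lambda(4))           # 10
```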

But the @property decorator requires access to the decorated function again to add a setter or deleter. This isn't possible in a simple dict() call, because the arguments are not in scope yet. This also means that new lambda locals probably aren't good enough to implement something like letrec.

That leaves us with verbose incremental external updates to the class after construction (JavaScript-style) or the new walrus operator := from 3.8. Making def a text macro for the walrus seems like a good fit, but that means dropping 3.7 support before 3.8 is even out yet (it's not like Hebigo is ready for release either). We could pretty much expand the class body to a lambda body. I'm still not sure about metaclasses in some cases though.
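Here is a hedged Python sketch of both halves of this argument. It requires Python 3.8+ for `:=`, and the names `build_with_dict` and `build_with_walrus` are hypothetical. The first part shows why a plain `dict()` call cannot add a setter; the second shows a lambda body with walrus assignments doing it.

```python
def build_with_dict():
    # `foo` in the second argument is an ordinary variable lookup. The keyword
    # argument above it has not bound any name, so this raises NameError.
    return dict(
        foo=property(lambda self: self._foo),
        foo_with_setter=foo.setter(lambda self, v: setattr(self, "_foo", v)),
    )


try:
    build_with_dict()
    dict_error = None
except NameError as e:
    dict_error = e

# With := a lambda body can bind a local and then refer back to it.
build_with_walrus = lambda: (
    (foo := property(lambda self: self._foo)),
    (foo := foo.setter(lambda self, v: setattr(self, "_foo", v))),
    {"foo": foo},
)[-1]

Foo = type("Foo", (), build_with_walrus())
obj = Foo()
obj.foo = 5
print(obj.foo)  # 5
```

The walrus version still builds a plain dict at the end, so it does nothing by itself for metaclasses that supply their own namespace.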

gilch commented 5 years ago

Metaclasses can have a __prepare__ attribute. So the behavior of getting and setting class attributes during the execution of the class body can be overridden in pretty much arbitrary ways. That leaves us with external updates accessing the namespace via [] or .; or passing the namespace to eval() or exec().
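As an illustration of how far `__prepare__` can change class-body behavior, here is a small sketch in which the namespace records every assignment. `LoggingNamespace` and `Meta` are made-up names for this example.

```python
class LoggingNamespace(dict):
    """A dict that records the key of every assignment made in the class body."""

    def __init__(self):
        super().__init__()
        self.log = []

    def __setitem__(self, key, value):
        self.log.append(key)
        super().__setitem__(key, value)


class Meta(type):
    @classmethod
    def __prepare__(mcs, name, bases, **kwds):
        # The class body executes with this object as its local namespace.
        return LoggingNamespace()

    def __new__(mcs, name, bases, namespace, **kwds):
        cls = super().__new__(mcs, name, bases, dict(namespace))
        cls.assignment_log = namespace.log
        return cls


class Foo(metaclass=Meta):
    a = 1
    a = 2
    b = a + 1


# The interpreter also sets a few dunder names, so filter those out.
print([k for k in Foo.assignment_log if not k.startswith("__")])
```

A macro that collapses the body into one dict literal would report `a` only once here, which is the kind of behavioral difference at stake.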

With external updates, the macro could hide top-level writes, but it really can't do that to reads, because the variables may appear in bracketed expressions that can't be rewritten short of parsing AST.

With eval() or exec(), we can control the local namespace, so reads inside bracketed expressions can still work. It's not as simple as passing the result of __prepare__, because the class body and its methods still need access to the surrounding lexical scope. Globals are easy. Perhaps a ChainMap starting with the prepared namespace could work for the locals. But simply adding locals() to the chain still isn't good enough if the class definition is nested in more than one function. Locals won't appear in the innermost scope unless they are referenced in the function body directly:

>>> def foo():
...     a=1
...     def bar():
...         b=2
...         print(locals())
...     bar()
...     
>>> foo()
{'b': 2}
>>> def foo():
...     a=1
...     def bar():
...         a
...         b=2
...         print(locals())
...     bar()
...     
>>> foo()
{'b': 2, 'a': 1}

A macro could add these, but can't know they exist. Perhaps we have to restrict class definitions to top-level only. It would eliminate these crazy scoping issues, but I don't like it.
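The ChainMap idea mentioned above can be sketched roughly as follows. `build_namespace` and its parameters are hypothetical, and the class body is passed as a source string purely for the demonstration.

```python
from collections import ChainMap


def build_namespace(body_source, prepared, enclosing_locals, global_scope):
    # Writes land in the first map (the prepared class namespace);
    # reads fall through to the enclosing function's locals, then globals.
    exec(body_source, global_scope, ChainMap(prepared, enclosing_locals))
    return prepared


def outer():
    a = 1
    # locals() here only contains names visible in this one frame.
    return build_namespace("b = a + 1", {}, locals(), {})


print(outer())  # {'b': 2}
```

This handles one level of nesting. As the REPL sessions show, a name from a function two levels out would be absent from `locals()` unless the inner function happens to reference it, so the chain would miss it.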

The class macro basically has to do all the steps the Python interpreter does when executing a class body:

I'm not sure how to do all of these.
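For reference, the steps the Python data model documents for class creation are: resolve MRO entries, determine the metaclass, prepare the namespace, execute the body, and create the class object. These can be approximated with helpers from the standard `types` module. This is only a sketch, not Hebigo's implementation. `build_class` and its parameters are hypothetical names, and it skips details such as `__qualname__` and `__orig_bases__`.

```python
import types


def build_class(name, bases, body_source, kwds=None, scope=None):
    """Hypothetical helper mirroring the documented class-creation steps."""
    # 1. Resolve MRO entries (bases that define __mro_entries__).
    resolved = types.resolve_bases(bases)
    # 2 and 3. Determine the metaclass and prepare the namespace
    #    (this calls the metaclass's __prepare__ if it has one).
    meta, namespace, kwds = types.prepare_class(name, resolved, kwds)
    # 4. Execute the class body with the prepared namespace as its locals.
    exec(body_source, {} if scope is None else scope, namespace)
    # 5. Create the class object by calling the metaclass.
    return meta(name, resolved, namespace, **kwds)


Foo = build_class("Foo", (), "x = 1\ndef double(self):\n    return self.x * 2\n")
print(Foo().double())  # 2
```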

gilch commented 5 years ago

It's possible to make the compiler put things in locals by defining a function without calling it.

>>> def foo():
...     a=1
...     def bar():
...         lambda: [a, b, c, d, e, f, g]
...         print(eval('a'))
...     bar()
...     
>>> foo()
1
>>> def foo():
...     a=1
...     def bar():
...         print(eval('a'))
...     bar()
...     
>>> foo()
Traceback (most recent call last):
    ...
NameError: name 'a' is not defined

The macro could try to put every symbol in the class definition and every identifier in bracket expressions into the locals() this way, just by lexing the words with a regex. No need for AST. This seems like a hack, but it could work. Obviously, reserved words would have to be excluded.
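A crude sketch of that regex approach, assuming a helper named `closure_stub` (hypothetical) that returns the source of a never-called lambda mentioning every non-keyword identifier it finds:

```python
import keyword
import re


def closure_stub(source):
    """Return a lambda expression that mentions every identifier in `source`."""
    names = []
    for word in re.findall(r"[A-Za-z_][A-Za-z0-9_]*", source):
        # Skip reserved words and duplicates, keeping first-seen order.
        if not keyword.iskeyword(word) and word not in names:
            names.append(word)
    return "lambda: [{}]".format(", ".join(names))


print(closure_stub("[a + b for a in c if a is not None]"))
# lambda: [a, b, c]
```

Because it only lexes words, this also picks up attribute names and words inside string literals. That should be harmless as long as the lambda is never called, but it is part of why this feels like a hack.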

gilch commented 5 years ago

I've been playing around with more possibilities.

It would be nice if def: could assign to an attribute of something, (like setattr()). This way it's not forced to only use globals() and you could def things directly into _macro_ or a class object. But now the basic form needs a third argument (the namespace object).

def: namespace key value

def: namespace name: args
    body

It looks like this works, but

def: foo bar: x

Does this mean something like setitem(globals(), 'foo', bar(x)) or more like setattr(foo, 'bar', lambda x:())? It's ambiguous. There are a lot of possible ways to deal with this problem.

The simplest would be to add a dot between the names. This is how attributes are accessed, and it's also how they are assigned in Python, so this is sensible.

def: foo bar: x
# This one is the lambda.
def: foo.bar: x

The option most resembling Python would be a namespace anaphor. That way the third argument is whatever _ns_ (or something) happens to be in the current scope. The class macro would set this to its own namespace. You could also !let it :be something else and not have to repeat the argument when defining multiple functions in the same namespace.

def: foo bar: x
# This one is the lambda.
!let: _ns_ :be foo
    def: bar: x

gilch commented 5 years ago

The anaphor has problems. It can cause surprises if someone happens to define one in the global scope. You might think you're defining a global, but are actually defining an attribute.

While you might save words when def:ing multiple attributes, the overhead of the !let doesn't seem worth it for just one.

If there's no _ns_ in scope at all (maybe typical at the top level), trying to use it would raise a NameError.

The def: macro could require this to be defined to work. Options,

The def: macro could just catch the NameError when it happens, but this is a performance hit since it requires an additional lambda and call every time. This is not much worse than a lot of things Hebigo is already doing, honestly. Not really a problem at the top level, but inner def:s might get a noticeable hit.
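The "additional lambda and call" cost comes from the fact that `try`/`except` is a statement, so an expression-only expansion has to wrap the lookup in a thunk and pass it to a helper. A minimal sketch, where `ns_or_default` and `default_ns` are made-up names:

```python
def ns_or_default(thunk, default):
    """Call `thunk`; fall back to `default` if the name it reads is unbound."""
    try:
        return thunk()
    except NameError:
        return default


default_ns = {}  # stand-in for whatever the fallback namespace would be

# No _ns_ is in scope here, so the lookup falls back to the default.
at_top_level = ns_or_default(lambda: _ns_, default_ns)

# Inside a scope that binds _ns_, the thunk finds it through the closure.
other_ns = {"name": "other"}
in_scope = (lambda _ns_: ns_or_default(lambda: _ns_, default_ns))(other_ns)

print(at_top_level is default_ns, in_scope is other_ns)  # True True
```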

Hebigo lacks mutable local scope. !let only introduces new variables; it cannot reassign them. Namespaces you can def: into can compensate for this, but they won't automatically nest like normal lexical scope does. This is more likely to cause confusion when the namespace is used implicitly.

gilch commented 5 years ago

I'm currently leaning towards a compromise solution: require an explicit name, but allow an abbreviation, so

def: _macro_.foo: a b c
    ...

could also be written as

!let: _at_ :be _macro_
    def: @foo: a b c
        ...

The namespace anaphor is _at_ (for "attribute"), but it's only used when the name starts with @, so it's an abbreviation for

!let: _at_ :be _macro_
    def: _at_.foo: a b c
        ...

Not much of an abbreviation, but it looks a lot nicer in a hypothetical class: macro. It could also work for decorators. And the class: macro could also assign the new class to an attribute.

class: Foo: Bar Baz : metaclass spam
    :@ glitter
    :@ sparkle: 777
    """
    docstring
    """
    def: @eggs None
    def: @foo: self
        :@ property
        """ doc """
        self._foo
    def: @foo: self val
        :@ @foo.setter  # note double @
        def: self._foo val
    def: @foo: self
        :@ @foo.deleter
        del: self._foo
    class: @Inner: Quux

The class object doesn't exist until after the class is built, so _at_ would have to be set to the dict used to build the class instead. But that would need setitem() instead of setattr(). This is easy to fix with a proxy object wrapping the dict. We can't just use SimpleNamespace because metaclasses might make some other mapping type instead of a plain dict, but Drython already has something that should work.

gilch commented 5 years ago

This form kind of resembles Ruby's sigils. It sort of conflicts with the decorator syntax visually. Any bracketed expression would have to use _at_ instead. This seems acceptable. We could also have a del: macro that works the same way (i.e. del: foo or del: foo.bar or del: @foo).

gilch commented 5 years ago

We can't start a symbol with @ because then it's not a valid identifier. I thought maybe the macro would be able to get to it first and it would be fine as long as we don't emit such a name. But the parser itself will reject it before the macro can get to it. I was a little tempted to adjust the parser, but I think it's just as well. Allowing invalid identifiers to be symbols (without read-time munging) could create incompatibilities with Lissp.

But we could use : instead and make it a control word. (:@foo could also work, but : seems sufficient, since we don't need any other control words in that position. Yet. I'm still considering supporting type annotations, but these are lambdas.)

gilch commented 5 years ago

I discovered that the types.new_class() function will make supporting metaclasses correctly much easier. It also uses the explicit namespace approach in its exec_body callback, so I think this approach is correct and preferable to using exec() or locals(). I think I'll also adjust Lissp's deftype and Hissp's FAQ to use this function as well.
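A short sketch of `types.new_class()` with a metaclass keyword and an explicit-namespace callback. The callback parameter is named `exec_body` in the standard library. `Meta`, `body`, and the `spam` keyword are stand-ins for the example.

```python
import types


class Meta(type):
    """Stand-in metaclass that accepts an extra keyword argument."""

    def __new__(mcs, name, bases, namespace, spam=None):
        cls = super().__new__(mcs, name, bases, namespace)
        cls.spam = spam
        return cls

    def __init__(cls, name, bases, namespace, spam=None):
        super().__init__(name, bases, namespace)


def body(ns):
    # The callback receives the prepared namespace and fills it in explicitly,
    # so later entries can read earlier ones.
    ns["__doc__"] = "docstring"
    ns["foo"] = property(lambda self: self._foo)
    ns["foo"] = ns["foo"].setter(lambda self, val: setattr(self, "_foo", val))


Foo = types.new_class("Foo", (), {"metaclass": Meta, "spam": "eggs"}, body)

obj = Foo()
obj.foo = 7
print(obj.foo, Foo.spam)  # 7 eggs
```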