Open Technologicat opened 6 years ago
I don't see a compelling argument here.. just "would be really nice"?
The argument is that currently, it's not possible to do certain things - such as, in the use case above, define a λ macro that allows default values for its arguments, or allows *args and/or **kwargs.
The point of this macro is to reduce the amount of typing - not for lambda itself, but shortening lambda arg0, ...: begin(body0, ...) to λ(arg0, ...)[body0, ...] - so that there is no need to type out the begin(), making it more convenient to write multi-expression lambdas.
The example is perhaps a bit silly in that I have no idea if I'll ever use this particular macro in production code - it's so unpythonic that it borders on the limit of good taste. At least I won't use it if there is no way to give defaults to arguments, and it doesn't support *args or **kwargs. :)
I can try to find a better example where the feature is needed; this is just where I first noticed the missing feature, so I thought I'd open a ticket for discussion before I forget.
I don't know if it makes the use case any more compelling, but https://github.com/Technologicat/unpythonic/commit/b51337b2f5615afd44c82390fb40c17af6bd17e5 makes λ a first-class citizen that can have not only multiple body expressions, but also its own local variables.
To become really useful, λ would need to handle default values for arguments, and *args and **kwargs - i.e. have all the args-handling features of the regular lambda - but currently this can't be done.
I think I could take a look at what it would take to support named args in MacroPy (to support default values for args in λ), but *args and **kwargs are probably better left out until there is no need to support Python 3.4.
Here, I made a first cut of this: https://github.com/Technologicat/macropy/commit/653b2d2292215b1ff9aff0a0c75bc7085ad8b6b5
At least all tests still pass, so I probably didn't break much :3
Here's also an updated λ that uses the new mechanism, for declaring default values: https://github.com/Technologicat/unpythonic/commit/4e2a28c9c67244cbdd9aa3ab096ce7dc1a5abb09
How it works:
There is a new magic in **kw, called kwargs. It gets the named arguments given to the macro invocation (if a Call), as an OrderedDict. (A better name is welcome, but OTOH kwargs is the usual pythonic partner of args.) The key is an str, the value is an AST node.
The old kwargs field in MacroData was only used to save the assignment target of a with block; this is now saved in the new field extrakws. This split was done to isolate the user-given named args from the **kw magic dictionary, which MacroPy uses internally.
OrderedDict was chosen because the keywords field of a Call is a list; there may be important information encoded in the ordering, so we should preserve it.
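As a rough illustration of the data shape described above, here is a sketch that uses only the standard library ast module, not MacroPy itself. The helper name named_args is made up for this example; it collects the keyword arguments of a call into an OrderedDict that maps each name (a str) to the AST node of its value, in source order.

```python
import ast
from collections import OrderedDict

def named_args(source):
    """Return the keyword arguments of a call expression as name -> AST node.

    Hypothetical helper for illustration only; not part of MacroPy.
    """
    call = ast.parse(source, mode="eval").body
    if not isinstance(call, ast.Call):
        return OrderedDict()
    # Call.keywords is a list of ast.keyword nodes, in source order.
    return OrderedDict((kw.arg, kw.value) for kw in call.keywords)

kwargs = named_args("mac(x=1, y=a + b)")
print(list(kwargs))            # ['x', 'y']
print(ast.dump(kwargs["y"]))   # the unevaluated AST for `a + b`
```

Note that the values stay unevaluated: a and b do not need to exist, because nothing is run.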
As for *args and **kwargs, my suggestion is to ignore that part of my original post; the problem disappears once we upgrade requirements to Python 3.5. Then the extra given arguments are absorbed into new specially formatted items in args and keywords fields of a Call. With the present addition, the mechanism we have now can handle both.
Anyway, named args for macros are now here - thoughts?
[edit] clarify why OrderedDict; mention data types of key and value. [edit2] fix silly mistakes.
Re-checking PEP 448, the proposed first-cut solution does need a small revisit after upgrading to Python 3.5, because multiple * and ** items may then appear in the same call.
Multiple * pose no problem. The code that already exists in MacroPy can handle the Starred nodes just like any other arg, and let the macro do what it wants with them, since the macro asked for arguments. ;)
Supporting multiple ** requires perhaps dropping the OrderedDict currently proposed here, and just passing through a list of keyword items. Then let the macro do what it wants with them.
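To make the PEP 448 point concrete, here is a small standard-library sketch (assuming Python 3.8 or later, where literals parse as Constant). It shows how several * and ** items appear in a Call node, and why a mapping keyed by name cannot represent more than one ** item.

```python
import ast

call = ast.parse("mac(*a, 1, *b, x=2, **k1, **k2)", mode="eval").body

# Positional side: Starred nodes sit in Call.args next to ordinary arguments.
positional_kinds = [type(node).__name__ for node in call.args]

# Keyword side: each ** item is an ast.keyword whose arg is None.
keyword_names = [kw.arg for kw in call.keywords]

# Keying a mapping by name merges the two ** items into a single None entry.
by_name = {kw.arg: kw.value for kw in call.keywords}

print(positional_kinds)                  # ['Starred', 'Constant', 'Starred']
print(keyword_names)                     # ['x', None, None]
print(len(call.keywords), len(by_name))  # 3 2
```

Passing the keyword list through unchanged keeps all three items and their order.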
Ah, well, second cut:
https://github.com/Technologicat/macropy/commit/ddd9d7545eeac259dcaf06c08be286b3667addfd https://github.com/Technologicat/unpythonic/commit/80af4b8fe692fd78aeaf70f3759fe7be0d2b7581
Ditched the OrderedDict in favor of just passing through the list of keyword objects. The advantage, beside better 3.5 support, is that (in PG's famous? words) the abstraction is so thin it's practically transparent - the user can now use the Green Tree Snakes docs to understand what the magic kwargs contains.
Now, thoughts? :)
Thanks Juha,
but your solution, and the whole situation in general, leaves me more perplexed... I've opened the door to calling the macro with any positional argument or keyword with transform() (even if it still needs refining, to protect the parameters injected by the machinery against being shadowed by those specified by the user), and you propose to augment this somewhat arcane way of passing arguments (which for me is now the args parameter to the macro) that the macro implementer has to parse on their own, maybe with the complication of supporting multiple runtime versions... It doesn't seem the right thing to do.
I would rather pass those parameters and keywords as real Python objects, not as AST trees, but when expansion happens there is nothing running yet, so this would work only for parameters and keywords bound to literals or pure expressions... I need to think it over and see some real, isolated examples.
Anyway please open a PR with your code. I mean, move your commits to another branch (one per PR) and open a PR from it or your code will not be commentable.
Please post here some example of your lambda macro, with comments so that I can understand what it's meant to do without reading a ton of code
Thanks for the heads-up, I'll make my code commentable and follow up with a PR for discussion.
IMHO, args as an AST is a feature, not a bug; as you said, it's before run-time, so nothing exists yet. Leaving it to each macro to decide what to do with the input ASTs sounds to me like it's exactly within the job description of a macro.
Why some form of args - as a minimal example, consider:
@macros.expr
def let(tree, args, **kw):  # args: sequence of ast.Tuple: (k1, v1), (k2, v2), ..., (kn, vn)
    names = [k.id for k, _ in (a.elts for a in args)]
    values = [v for _, v in (a.elts for a in args)]
    lam = q[lambda: ast_literal[tree]]
    lam.args.args = [arg(arg=x) for x in names]
    return q[ast_literal[lam](ast_literal[values])]

@macros.expr
def letseq(tree, args, **kw):
    if not args:
        return tree
    first, *rest = args
    return let.transform(letseq.transform(tree, *rest), first)
Note .transform killed all boilerplate to write those macro definitions, which is excellent. Usage:
let((x, 1),
    (y, 2))[
      print(x + y)]

letseq((x, 1),
       (x, x+1))[
         print(x)]
The new identifiers are declared as bare names - being able to do this relies on the fact that the input is an AST. Sure, we could place the bindings at the beginning of the tree:
let[((x, 1),
     (y, 2)),
    print(x + y)]
but a separate bindings section looks more readable.
Why some form of named args - it would let us do this:
let(x=1,
    y=2)[
      print(x + y)]
which looks more pythonic. It also allows neat new stuff like args with default values in λ, but I now think let is overall a better example.
Finally, this particular kwargs hack fixes an asymmetry in the API; if you wrote mac(x=5)[...], which is very pythonic, it would be silently ignored, whereas mac(5) would place the 5 into args (as a Num).
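To see where each spelling ends up, here is a standard-library sketch that takes a macro-style invocation apart. It assumes Python 3.9 or later (where Subscript.slice holds the body expression directly, and literals parse as Constant rather than Num); the helper split_invocation is hypothetical and not MacroPy API.

```python
import ast

def split_invocation(source):
    """Split `name(args..., kw=...)[body]` into its parts, all as AST nodes.

    Hypothetical helper for illustration only.
    """
    node = ast.parse(source, mode="eval").body
    if not (isinstance(node, ast.Subscript) and isinstance(node.value, ast.Call)):
        raise ValueError("expected something of the form name(...)[...]")
    call = node.value
    return call.func.id, call.args, call.keywords, node.slice

_, pos_args, pos_kws, _ = split_invocation("mac(5)[body]")
_, named_args_, named_kws, _ = split_invocation("mac(x=5)[body]")

# The positional 5 lands in Call.args; the named one only in Call.keywords.
print(len(pos_args), len(pos_kws))        # 1 0
print(len(named_args_), len(named_kws))   # 0 1
```

A macro system that only forwards Call.args never sees the second form, which is the asymmetry in question.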
Whether an args-like arcane mechanism is needed at all is another question. If it can be axed altogether, that would simplify things. I didn't see this angle before.
In the context of the new .transform, do you have a proposal (idea, not code) on how to handle named args from the use site? Normal run-time code obviously won't call mac.transform(tree, *args, **kwargs); it will invoke the macro as mac(a0, ..., an, k0=v0, ..., km=vm)[...] (hypothetical syntax if named args are allowed).
I think some isolation is needed; it is perfectly valid to define a let variable called gen_sym or similar, and it shouldn't conflict with MacroPy internals. Similarly, the let should not always bind a gen_sym, just because that name happens to exist in **kw. For let (and similarly for λ), there needs to be a way to tell apart user-given vs. MacroPy internal kwargs. Shadowing is only a partial solution; ideally both definitions (if present) should be accessible in the macro code.
Since I promised "neat new stuff", here's also an example on λ (all safeties stripped):
@macros.expr
def λ(tree, args, kwargs, **kw):  # <-- requires the kwargs hack
    withdefault_names = [k.arg for k in kwargs]
    defaults = [k.value for k in kwargs]
    names = [k.id for k in args] + withdefault_names
    newtree = do.transform(tree)
    lam = q[lambda: ast_literal[newtree]]
    lam.args.args = [arg(arg=x) for x in names]
    lam.args.defaults = defaults  # for the last n args
    return lam

@macros.expr
def do(tree, **kw):
    ...  # beside the point; see unpythonic.syntax
Usage:
echo = λ(myarg="hello")[print(myarg),
                        myarg]
assert echo() == "hello"
assert echo("hi") == "hi"

count = let((x, 0))[
          λ()[x << x + 1,
              x]]
assert count() == 1
assert count() == 2

myadd = λ(x, y)[print("myadding", x, y),
                localdef(tmp << x + y),
                print("result is", tmp),
                tmp]
assert myadd(2, 3) == 5
The essential point is, kwargs is used to capture keyword nodes from the use site, where arg is the name and value is the AST node representing the value. These can be abused as an args-with-defaults declaration. (In a call, named args after positionals; in a function declaration, args with defaults last. Isomorphic, or close enough.)
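The call-to-declaration mapping can be tried out with the standard library alone. The sketch below is not the MacroPy implementation: the helper make_lambda is made up for illustration, it assumes Python 3.8 or later (for the posonlyargs field), and it does no validation (a ** item, whose keyword has arg None, would break it).

```python
import ast

def make_lambda(call_source, body_source):
    """Build a lambda whose parameters mirror the arguments of a call.

    Positional names become plain parameters; keyword nodes become
    parameters with defaults. Hypothetical helper, no error checking.
    """
    call = ast.parse(call_source, mode="eval").body
    body = ast.parse(body_source, mode="eval").body
    names = [a.id for a in call.args] + [kw.arg for kw in call.keywords]
    lam = ast.Lambda(
        args=ast.arguments(
            posonlyargs=[],
            args=[ast.arg(arg=n) for n in names],
            vararg=None,
            kwonlyargs=[],
            kw_defaults=[],
            kwarg=None,
            # defaults apply to the last n parameters, matching the keywords
            defaults=[kw.value for kw in call.keywords],
        ),
        body=body,
    )
    tree = ast.fix_missing_locations(ast.Expression(body=lam))
    return eval(compile(tree, "<sketch>", "eval"))

greet = make_lambda("λ(greeting, name='world')", "greeting + ', ' + name")
print(greet("hi"))           # hi, world
print(greet("hi", "there"))  # hi, there
```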
No *args or **kwargs support yet, but in 3.5, not difficult to add. Just sanity-check there is at most one of * and ** each, and check placement w.r.t. other args. Extending this slightly could give support also for only-by-name args.
[edit] The count example requires let from unpythonic.syntax; the one posted above is there called simple_let and doesn't support assignments. Supporting an "assignment expression" requires some trickery which is beside the point here.
I just obsoleted my λ; this is much more pythonic, not to mention less brittle:
@macros.block
def multilambda(tree, **kw):
    @Walker
    def transform(tree, *, stop, **kw):
        if type(tree) is not Lambda or type(tree.body) is not List:
            return tree
        bodys = Tuple(elts=tree.body.elts, ctx=Load())
        bodys = copy_location(bodys, tree)
        stop()  # don't recurse over what do[] does
        bodys = transform.recurse(bodys)  # but recurse over user code
        tree.body = do.transform(bodys)
        return tree
    yield transform.recurse(tree)
Usage:
with multilambda:
    echo = lambda x: [print(x), x]
    assert echo("hi there") == "hi there"

    count = let((x, 0))[
              lambda: [x << x + 1,
                       x]]
    assert count() == 1
    assert count() == 2

    t = lambda: [[1, 2]]
    assert t() == [1, 2]
The pythonic let use case still stands; there named arguments would be useful.
sorry -- this has little to do with the interesting discussion of late, but i wonder about the contrived code example in the first comment
i don't understand how this works in 2 ways, even in custom MacroPy or unpythonic.
x and y are not known names at the point λ is called. the comment about needing named arguments is later in the code, so surely λ(x, y) works but i don't know how.
myadd = λ(x, y)[print(x, y), x + y]
λ is a function called with two un-assigned symbols which returns a function-like callable indexable object, this is fine.
but as well print(x, y) is unavoidably evaluated as soon as it is seen by the Python interpreter and so the slice / indexing object becomes [None, x + y].
Perhaps from multilambda import macros, λ puts all code after it in a big try...except NameError: block?
And the import also enables some "lazy-loading" feature of the Python interpreter so that print(x, y) is not evaluated to None immediately? Or did you mean to write lambda: print(x, y)?
Cat: it's a macro thing. :)
Roughly speaking, a macro intercepts and transforms code before the rest of the interpreter even sees it. It just needs to be valid syntactically, so that Python's parser can convert the source code text to an AST. MacroPy then hands over (relevant parts of) this AST to macros, to be transformed into a new AST. Normal run-time interpretation starts only after all macros have "run" (been expanded). This gives some flexibility normal code doesn't have.
The λ is a macro; it looks like a function call, but it's subtly different. The [...] are part of the macro syntax in MacroPy; they delimit the body, i.e. the main stuff that goes in. The (...) delimit macro arguments (args) - these are also ASTs, just placed inside (...) instead of [...].
Both args and body are sent to the same "call" of the macro. Hence, λ(arg0, ...)[body0, ...] is just one operation, not two. The undefined names are never seen by the interpreter - they are transformed into argument names in a lambda.
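The parse-transform-run sequence can be imitated with the standard library. The following is only a toy stand-in for what a macro expander does, not how MacroPy or unpythonic implement it: the class name ExpandLambdaMacro is made up, the rewrite to a tuple indexed with -1 is a simplification chosen for this sketch, and Python 3.9 or later is assumed (for the shape of Subscript.slice).

```python
import ast

class ExpandLambdaMacro(ast.NodeTransformer):
    """Rewrite λ(a, b)[e1, ..., en] into lambda a, b: (e1, ..., en)[-1]."""

    def visit_Subscript(self, node):
        self.generic_visit(node)
        call = node.value
        if not (isinstance(call, ast.Call)
                and isinstance(call.func, ast.Name)
                and call.func.id == "λ"):
            return node
        body = node.slice
        if not isinstance(body, ast.Tuple):
            body = ast.Tuple(elts=[body], ctx=ast.Load())
        # Evaluate all items in order, keep the value of the last one.
        last = ast.Subscript(value=body, slice=ast.Constant(value=-1),
                             ctx=ast.Load())
        params = ast.arguments(
            posonlyargs=[], args=[ast.arg(arg=a.id) for a in call.args],
            vararg=None, kwonlyargs=[], kw_defaults=[], kwarg=None,
            defaults=[])
        return ast.copy_location(ast.Lambda(args=params, body=last), node)

source = "myadd = λ(x, y)[print(x, y), x + y]"
tree = ast.parse(source)  # parsing succeeds although x and y are undefined
tree = ast.fix_missing_locations(ExpandLambdaMacro().visit(tree))

namespace = {}
exec(compile(tree, "<sketch>", "exec"), namespace)  # run only after expansion
myadd = namespace["myadd"]
print(myadd(2, 3))  # prints "2 3", then 5
```

By the time the code runs, no name λ is looked up at all; x and y exist only as lambda parameters.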
Now, unpythonic does a bit of magic - the body of λ gets wrapped with an unpythonic.seq.do, which (in its normal runtime code part) takes a list of regular old Python lambdas, and runs them one by one. (There are some technical details to support variables local to the "do", beside the point here.)
The "do" macro, which is what the λ macro actually inserts, then makes this a bit easier to use, by taking the code entered by the user, and wrapping each item in a lambda - automatically - so that execution is delayed until the underlying unpythonic.seq.do actually runs.
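The run-time half of this can be pictured with a deliberately simplified stand-in. The do function below is hypothetical and much smaller than unpythonic.seq.do (in particular, it has no support for local variables); it only shows the idea of delaying each item behind a lambda and running the items in order.

```python
def do(*thunks):
    """Call each zero-argument function in order; return the last result.

    Simplified, hypothetical stand-in; not the real unpythonic.seq.do.
    """
    result = None
    for thunk in thunks:
        result = thunk()
    return result

log = []
# What a user would write as [log.append("first"), log.append("second"), 42],
# after each item has been wrapped in a lambda:
value = do(lambda: log.append("first"),
           lambda: log.append("second"),
           lambda: 42)
print(log, value)  # ['first', 'second'] 42
```

Nothing inside the lambdas runs until do calls them, which is why the wrapping has to happen before the interpreter evaluates the items.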
Hope this helps :)
[edit] fix text formatting
Yes, it helps very much! I sort of thought Python tries to resolve names at parse time and complain at runtime, but this is all very interesting to learn :)
Cat: AFAIK, Python basically resolves everything at runtime. Only reserved words such as import and def always mean what we expect them to; almost anything else can be overridden (either by rebinding the original or shadowing it by something more local) from anywhere at any time. :)
I've sometimes tripped over this myself, when writing a context manager, declaring
def __exit__(self, type, value, traceback):
    ...
and then wondered why a call to the built-in type() from inside that method fails to work. :)
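A small self-contained demonstration of that pitfall follows; the class name Shadowing is made up for the example. On a normal exit from the with block the type parameter is None, so calling it raises a TypeError, while the built-in stays reachable through the builtins module.

```python
import builtins

class Shadowing:
    def __enter__(self):
        return self

    def __exit__(self, type, value, traceback):
        # Inside this method, `type` is the parameter, not the built-in.
        try:
            type(self)
        except TypeError as err:
            self.error = err
        # The built-in is still available under its qualified name.
        self.cls = builtins.type(self)
        return False

with Shadowing() as s:
    pass

print(s.error)  # the TypeError raised by calling None
print(s.cls is Shadowing)  # True
```

Renaming the parameters (for example to exc_type, exc_value, tb) avoids the shadowing altogether.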
On some occasions, being able to pass named arguments to macros would be useful.
Use case, related to the multilambda macro in unpythonic (rackety lambda with implicit begin, for Python):
(Usage, implementation.)
For the same use case, *args and **kwargs support would be really nice. :)
Thoughts?
[edit] update link. [edit2] these links are now obsolete; the silly λ macro has been removed.