matthiasgoergens closed this issue 6 months ago.
We aren't interested in changing the language, except for very minor clarifications and fixing loopholes.
How would using S-expressions help performance?
Sorry, forget about S-Expressions, they were just a silly example.
What I had in mind was something closer to:
from __syntax__ import strict_checking, idempotent, pure, reorder

@strict_checking
def h():
    some_code()
    some_more_code()

def f():
    # Allow the compiler to re-order stuff in this block.
    with reorder:
        x = foo()
        y = bar()
    # But no re-ordering afterwards:
    return fnord(x, y)

class SomeDataStructure:
    @idempotent
    def set_and_get(self, i, x):
        """Compiler is allowed to cache this function, or to execute it again multiple times, if that's cheaper."""
        self.foo[i] = x
        return self.some_other_expensive_computation()

@pure
def g():
    """Compiler is allowed (but not required) to defer computation of y.
    Or even to skip computation of g() completely, if its return value is never used.
    Alternatively, the compiler is also allowed to call g() multiple times, if that's cheaper than caching.
    """
    x = some_expensive_computation()
    y = some_other_expensive_computation()
    if x > 3:
        return y
    else:
        return None
In a vanilla implementation, reorder, idempotent, strict_checking, pure, etc. are allowed to do nothing. An optimizing implementation can make extra assumptions or make use of the declared properties (but has to behave as if it were the vanilla implementation).
You can get some of what idempotent would be doing via the existing lru_cache, but not everything.
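For concreteness, here is a minimal sketch (purely hypothetical, not part of any real __syntax__ module) of what the vanilla, do-nothing implementations could look like, plus the lru_cache comparison:

import contextlib
import functools

# Hypothetical no-op versions: decorators that return the function unchanged,
# and a context manager that does nothing.
def strict_checking(func):
    return func

idempotent = strict_checking
pure = strict_checking

reorder = contextlib.nullcontext()   # "with reorder:" then does nothing extra

# For comparison: lru_cache gives you some of what @idempotent would allow
# (caching repeated calls with the same hashable arguments), but it can't let
# an optimizer choose to re-execute instead of caching, and it keeps the
# cached results alive.
@functools.lru_cache(maxsize=None)
def expensive(n):
    return sum(i * i for i in range(n))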
I mostly had this kind of thing in mind for experiments, but also to be able to write more of the standard library in a restricted subset of vanilla Python that we can hopefully run much faster.
(If that restricted subset never gets exposed to users, then it's purely a performance improvement. But I can imagine some adventurous users might be keen.)
Also a possibility, similar to inline assembly:
from __syntax__ import bytecode
@bytecode
def myfunc(alist):
    2           0 LOAD_GLOBAL              0 (len)
                2 LOAD_FAST                0 (alist)
                4 CALL_FUNCTION            1
                6 RETURN_VALUE
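For reference, that body is just the output format of the standard dis module (the exact opcodes vary between CPython versions; CALL_FUNCTION is gone in 3.11):

import dis

def myfunc(alist):
    return len(alist)

dis.dis(myfunc)   # prints a LOAD_GLOBAL / LOAD_FAST / call-opcode listing like the one above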
Or you could add an unsafe mode that disables, e.g., bounds checking for a specific block, in order to speed things up.
I am not sure whether any of these are good ideas, nor what the syntax should look like.
This proposal sounds quite similar to the approach Cinder uses for Static Python. For example, this is a faster way of looking up the length of a list:
import __static__
from typing import List
def test(l: List[int]) -> int:
    return len(l)
In this case, test contains Cinder's FAST_LEN bytecode, which simply looks up ob_size in the interpreter (quite a lot faster than going through builtins.len()). Explorer Link
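For comparison, in stock CPython (without Cinder) the same function compiles to an ordinary global lookup and call, which you can check with dis (exact opcodes depend on the CPython version):

import dis
from typing import List

def test(l: List[int]) -> int:
    return len(l)

dis.dis(test)   # LOAD_GLOBAL len plus a generic call opcode; no FAST_LEN outside Cinder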
The downside of this approach is that it requires someone to knowingly write code to take advantage of these extension features (i.e., it doesn't speed up existing code).
Thanks for pointing to the prior work! I'll investigate.
Yes, being able to mix and match something like Static Python is exactly one of the use-cases I had in mind for this kind of flexibility.
The downside of this approach is that it requires someone to knowingly write code to take advantage of these extension features (i.e., it doesn't speed up existing code).
I was thinking of using this approach in libraries (including the standard library). That would speed up existing code that makes use of these libraries.
Also a possibility, similar to inline assembly:
You might be interested in a small project of mine that allows you to do this: https://github.com/brandtbucher/hax.
3.11 support isn’t complete yet, but other versions work fine.
https://news.ycombinator.com/item?id=32785012 has some interesting applications for something like this.
And by 'interesting' I mean horrifying.
In the spirit of bringing up not fully formed ideas for discussion:
Python the language is defined via its syntax. But as far as I know, CPython translates everything to bytecode fairly early on, and then doesn't care too much about the surface-level syntax. (Apart from error messages, etc.)
I also know that from __future__ import <something> can change that syntax for specific modules. Would it be possible to open up that mechanism more, so that it's available for experiments?
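An existing example of that mechanism is from __future__ import annotations (PEP 563), which changes, per module, how annotations are compiled:

from __future__ import annotations

# With the future import, annotations are stored as strings and never
# evaluated at function-definition time, so the undefined name here is fine.
def f(x: SomeUndefinedName) -> SomeUndefinedName:
    return x

print(f.__annotations__)  # string annotations, e.g. {'x': 'SomeUndefinedName', ...}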
Just for fun, as a really simple and naive idea, it would be interesting to try writing Python with S-Expression syntax. (Like Scheme. This wouldn't help with performance at all. Just a fun example.)
More seriously, it would be interesting to introduce a syntactic form that allows the compiler to, e.g., re-order statements within a specifically marked block.
Or you might want to forbid certain constructs in certain places, to enable certain optimizations to be done safely.
In any case, my question is: is my understanding right that just swapping out how bytecode gets produced for specific files would be enough to enable this? And how general is the existing from __future__ import machinery?
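(As far as I can tell, that machinery is also reachable programmatically: each __future__ feature exposes a compiler flag that can be passed to compile(), which is roughly the "swap out how bytecode gets produced for a specific unit" knob. For example:)

import __future__

source = "def f(x: Undefined) -> Undefined: pass"

# Compile this one unit with PEP 563 semantics, regardless of how the
# surrounding module was compiled.
code = compile(source, "<example>", "exec",
               flags=__future__.annotations.compiler_flag)
exec(code)  # no NameError: the annotations are never evaluated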