Stanley5249 opened 1 month ago
Generally, performance changes like this add too much complexity to the code to be desirable -- there's also nothing stopping you from just using your implementation on your own. With that being said, maybe this is a good idea, but I think this needs to be discussed on DPO.
cc @Eclips4, as you're now an AST expert :)
I've just had a look at the benchmarks but something feels wrong. The benchmarks say that `ast.literal_eval` takes the following time for parsing a constant:

```
constant
- literal_eval       : 0.6472s
- stack_literal_eval : 0.5110s
```
I don't understand how it can be that slow. Could you test your implementation using the command line, e.g., `./python -m timeit -s 'import ast' 'ast.literal_eval("42")'` (for instance, on a debug build, I get "50000 loops, best of 5: 4.53 usec per loop")? Or explain how those numbers were obtained? In addition, could you indicate whether the build is a debug, release, or a PGO (+LTO maybe) build, please?
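For reference, the per-call timing asked for above can also be measured from inside Python with the stdlib `timeit` module. This is a sketch roughly equivalent to the command-line invocation, using the same `"42"` input from the thread:

```python
import timeit

# Roughly equivalent to: ./python -m timeit -s 'import ast' 'ast.literal_eval("42")'
number = 10_000
per_loop = min(timeit.repeat('ast.literal_eval("42")',
                             setup="import ast",
                             repeat=5, number=number)) / number
print(f"{per_loop * 1e6:.2f} usec per loop")
```

Taking the minimum over several repeats, as `python -m timeit` does, filters out scheduling noise.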
@picnixz
Sorry for the confusion. The benchmark results I provided represent the total time for 300,000 runs across both methods and all test cases.
Additionally, I'm using Python 3.12.6 release build.
Ok, so the numbers are better. I'm still wondering whether the cost of maintaining a stack-based implementation is worth it. Usually `literal_eval` is used for relatively flat expressions, I think, but is it possible to also see the memory usage?
By the way, there is a comment saying that "a complex expression can overflow the C stack and cause a crash." Does your implementation also suffer from a similar drawback?
Finally, I'd prefer having benchmarks using pyperf or hyperfine, so that we also know the standard deviation of the benchmarks (if you're willing to push for this feature, we like having those kinds of benchmarks).
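On the memory-usage question, one way to answer it is with the stdlib `tracemalloc` module, which can report the peak allocation during a single call. A minimal sketch (the helper name `peak_bytes` is mine, not anything from the patch):

```python
import ast
import tracemalloc

def peak_bytes(func, *args):
    """Return the peak number of bytes traced while calling func."""
    tracemalloc.start()
    try:
        func(*args)
        _, peak = tracemalloc.get_traced_memory()
    finally:
        tracemalloc.stop()
    return peak

print(peak_bytes(ast.literal_eval, "[1, (2, 3), {'a': 4}, {5, 6}]"))
```

Comparing this number between the two implementations on the same inputs would show whether the explicit stack costs extra memory.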
The issue of "a complex expression can overflow the C stack and cause a crash" occurs with the `compile` function. You can reproduce it with `ast.literal_eval("+0" * 10**4)`, so it does suffer from this drawback.
```python
>>> import ast
>>> ast.literal_eval("+0" * 10**4)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "C:\Python\Python312\Lib\ast.py", line 66, in literal_eval
    node_or_string = parse(node_or_string.lstrip(" \t"), mode='eval')
                     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Python\Python312\Lib\ast.py", line 52, in parse
    return compile(source, filename, mode, flags,
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
RecursionError: maximum recursion depth exceeded during ast construction
```
I'll proceed with the benchmarks.
@picnixz
I just want to complain that pyperf is really not easy to use when generating this table. Here it is:
| Benchmark | literal_eval | stack_literal_eval |
|---|---|---|
| constant | 2.54 us | 1.86 us: 1.36x faster |
| unary | 3.53 us | 3.07 us: 1.15x faster |
| complex | 4.28 us | 3.94 us: 1.09x faster |
| empty_tuple | 3.16 us | 2.68 us: 1.18x faster |
| empty_list | 3.24 us | 2.67 us: 1.21x faster |
| empty_set | 3.80 us | 3.17 us: 1.20x faster |
| empty_dict | 3.60 us | 3.32 us: 1.08x faster |
| tuple | 12.7 us | 13.8 us: 1.09x slower |
| list | 12.4 us | 12.9 us: 1.04x slower |
| set | 12.9 us | 12.7 us: 1.01x faster |
| dict | 23.3 us | 23.0 us: 1.01x faster |
| nested | 73.0 us | 75.9 us: 1.04x slower |
| Geometric mean | (ref) | 1.09x faster |
For reproduction, see here.
> I just want to complain that pyperf is really not easy to use when generating this table.
That's why I do it only with the text table and not the markdown one :D (if you're talking about something else, like how to generate all rows, I agree that it's a bit painful to do. I use a bash script for that though).
Overall the improvements are only interesting for constants or empty objects. I don't think this is sufficient compared to the maintenance cost (also, lists and sets are quite important IMO). Someone else's opinion would be welcomed (maybe @JelleZijlstra?).
Thanks for this contribution. The current implementation of `ast.literal_eval` relies on nested functions while your stack-based version does not; I wonder if that makes a significant difference. Your version also seems to be missing a special case for `ast.Expression`, which is present in the current implementation.
Like some previous commenters, I'm concerned that more complex cases (such as lists) apparently get slower. I don't know how most people use literal_eval(), but I suspect that most users pass something more complex than a single constant.
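For readers following along, this is a minimal sketch of what a stack-based (non-recursive) evaluator looks like in principle — my illustration of the technique under discussion, not the actual patch (it only handles constants, tuples, lists, sets, and dicts, and omits the `ast.Expression` and unary/complex cases):

```python
import ast

def stack_literal_eval(source):
    """Evaluate a literal expression with an explicit work stack
    instead of Python-level recursion (illustrative subset only)."""
    root = ast.parse(source, mode="eval").body
    work = [(root, False)]   # (node, children_already_evaluated)
    results = []             # value stack, filled in post-order
    while work:
        node, done = work.pop()
        if isinstance(node, ast.Constant):
            results.append(node.value)
        elif isinstance(node, (ast.Tuple, ast.List, ast.Set)):
            if not done:
                work.append((node, True))
                for child in reversed(node.elts):  # pop in source order
                    work.append((child, False))
            else:
                n = len(node.elts)
                elems = results[len(results) - n:]
                del results[len(results) - n:]
                kind = {ast.Tuple: tuple, ast.List: list, ast.Set: set}
                results.append(kind[type(node)](elems))
        elif isinstance(node, ast.Dict):
            if not done:
                work.append((node, True))
                for k, v in zip(reversed(node.keys), reversed(node.values)):
                    work.append((v, False))
                    work.append((k, False))
            else:
                n = 2 * len(node.keys)
                flat = results[len(results) - n:]
                del results[len(results) - n:]
                results.append(dict(zip(flat[0::2], flat[1::2])))
        else:
            raise ValueError(f"malformed node or string: {node!r}")
    return results[0]

print(stack_literal_eval("(1, [2, 3], {'a': 4}, {5, 6})"))
```

The per-node bookkeeping (pushing each container twice, slicing the value stack) is plausibly where the slowdown on non-empty containers comes from.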
I've switched to a recursion-based approach, similar to the original implementation but much cleaner, resulting in increased performance.
The `ast.Expression` special case has also been fixed.
You can check out the changes in this branch: Recursion Branch.
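As a rough illustration of the shape of that approach (my sketch of a subset, not the linked branch), a recursion-based converter dispatches once per node and lets Python's call stack hold the traversal state:

```python
import ast

def recursive_literal_eval(source):
    """Recursive evaluation of a literal subset (illustrative sketch)."""
    def convert(node):
        if isinstance(node, ast.Constant):
            return node.value
        if isinstance(node, ast.Tuple):
            return tuple(map(convert, node.elts))
        if isinstance(node, ast.List):
            return list(map(convert, node.elts))
        if isinstance(node, ast.Set):
            return set(map(convert, node.elts))
        if isinstance(node, ast.Dict):
            return {convert(k): convert(v)
                    for k, v in zip(node.keys, node.values)}
        raise ValueError(f"malformed node or string: {node!r}")

    tree = ast.parse(source, mode="eval")  # wraps the body in ast.Expression
    return convert(tree.body)

print(recursive_literal_eval("[1, (2, 3), {'a': 4}]"))
```

Unlike the stdlib version, this defines `convert` once at module level rather than rebuilding nested closures on every call, which matches the "much cleaner" restructuring described above.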
| Benchmark | literal_eval | stack_literal_eval |
|---|---|---|
| constant | 2.62 us | 1.75 us: 1.49x faster |
| unary | 3.55 us | 2.61 us: 1.36x faster |
| complex | 4.39 us | 3.36 us: 1.30x faster |
| empty_tuple | 3.16 us | 2.30 us: 1.37x faster |
| empty_list | 3.29 us | 2.32 us: 1.42x faster |
| empty_set | 3.77 us | 2.84 us: 1.33x faster |
| empty_dict | 3.59 us | 2.67 us: 1.34x faster |
| tuple | 12.8 us | 11.6 us: 1.10x faster |
| list | 12.7 us | 11.4 us: 1.11x faster |
| set | 12.9 us | 11.8 us: 1.10x faster |
| dict | 22.4 us | 21.3 us: 1.05x faster |
| nested | 74.4 us | 68.6 us: 1.08x faster |
| Geometric mean | (ref) | 1.25x faster |
Feature or enhancement

Proposal

I have implemented a stack-based version of `ast.literal_eval` and conducted benchmarks to compare its performance with the existing implementation. The results indicate that the stack-based version may improve performance on non-nested expressions, although it is slightly slower on nested expressions.

Benchmark Results

The benchmarks were conducted on an Intel(R) Core(TM) Ultra 9 185H. Below are the results comparing `ast.literal_eval` and `stack_literal_eval`:

Question

I wonder whether this change is desirable?

Code

See here.

Related Issue

This work is related to the closed issue #75934.

Has this already been discussed elsewhere?

This is a minor feature, which does not need previous discussion elsewhere.

Links to previous discussion of this feature:

No response