python / cpython

The Python programming language
https://www.python.org

Stack-based `ast.literal_eval` Implementation #124503

Open Stanley5249 opened 1 month ago

Stanley5249 commented 1 month ago

Feature or enhancement

Proposal

I have implemented a stack-based version of ast.literal_eval and conducted benchmarks to compare its performance with the existing implementation. The results indicate that the stack-based version may improve performance on non-nested expressions, although it is slightly slower on nested expressions.
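The snippet below is only a sketch of the idea, not the code linked above, and it handles just a subset of what `literal_eval` accepts (constants, unary `+`/`-` on numbers, and tuple/list/set displays); the point is that an explicit work stack replaces the recursive descent:

```python
# Illustrative only: an explicit work stack instead of recursion. Containers
# are pushed twice, once to schedule their children and once to assemble the
# evaluated values. Only a subset of ast.literal_eval's inputs is handled.
import ast

def stack_literal_eval_sketch(source):
    root = ast.parse(source, mode="eval").body
    work = [(root, False)]      # (node, children already evaluated?)
    results = []                # evaluated values, in post-order

    while work:
        node, visited = work.pop()
        if isinstance(node, ast.Constant):
            results.append(node.value)
        elif isinstance(node, ast.UnaryOp) and isinstance(node.op, (ast.UAdd, ast.USub)):
            if not visited:
                work.append((node, True))
                work.append((node.operand, False))
            else:
                operand = results.pop()
                if not isinstance(operand, (int, float, complex)):
                    raise ValueError("malformed node or string")
                results.append(+operand if isinstance(node.op, ast.UAdd) else -operand)
        elif isinstance(node, (ast.Tuple, ast.List, ast.Set)):
            if not visited:
                work.append((node, True))
                # Push children in reverse so they are evaluated left to right.
                for child in reversed(node.elts):
                    work.append((child, False))
            else:
                n = len(node.elts)
                values = results[len(results) - n:]
                del results[len(results) - n:]
                if isinstance(node, ast.Tuple):
                    results.append(tuple(values))
                elif isinstance(node, ast.List):
                    results.append(values)
                else:
                    results.append(set(values))
        else:
            raise ValueError(f"malformed node or string: {node!r}")

    (value,) = results
    return value
```

The real implementation additionally has to cover dicts, complex literals such as `1+2j`, and the error cases that `literal_eval` is required to reject.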

Benchmark Results

The benchmarks were conducted on an Intel(R) Core(TM) Ultra 9 185H. Below are the results comparing ast.literal_eval and stack_literal_eval:

(benchmark chart image)

Question

Is this change desirable?

Code

See here.

Related Issue

This work is related to the closed issue #75934.


Has this already been discussed elsewhere?

This is a minor feature, which does not need previous discussion elsewhere

Links to previous discussion of this feature:

No response

ZeroIntensity commented 1 month ago

Generally, performance changes like this add too much complexity to the code to be desirable; there's also nothing stopping you from just using your implementation on your own. With that being said, maybe this is a good idea, but I think this needs to be discussed on DPO (discuss.python.org).

cc @Eclips4, as you're now an AST expert :)

picnixz commented 1 month ago

I've just had a look at the benchmarks, but something feels wrong. They report the following times for parsing a constant:

constant

  • literal_eval : 0.6472s
  • stack_literal_eval : 0.5110s

I don't understand how it can be that slow. Could you test your implementation using the command-line, e.g., ./python -m timeit -s 'import ast' 'ast.literal_eval("42")' (for instance, on a debug build, I get "50000 loops, best of 5: 4.53 usec per loop")? Or explain how those numbers were obtained? In addition, could you indicate whether the build is a debug, release, or a PGO(+LTO maybe) build please?
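For reference, a comparable per-call figure can also be obtained from Python code with the `timeit` module; a small sketch (not tied to any particular build):

```python
# Rough per-call measurement of ast.literal_eval("42"), mirroring the
# "best of 5" behaviour of the timeit CLI invocation above.
import timeit

number = 50_000
best = min(timeit.repeat("ast.literal_eval('42')",
                         setup="import ast",
                         repeat=5, number=number))
print(f"{best / number * 1e6:.2f} usec per call")
```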

Stanley5249 commented 1 month ago

@picnixz

Sorry for the confusion. The benchmark results I provided represent the total time for 300,000 runs across both methods and all test cases (so the 0.6472 s constant figure works out to roughly 2.2 us per call).

Additionally, I'm using a Python 3.12.6 release build.

picnixz commented 1 month ago

Ok, so the numbers are better. I'm still wondering whether the performance gain justifies the cost of maintaining a stack-based implementation. I think literal_eval is usually used for relatively flat expressions, but is it possible to also see the memory usage?

By the way, there is a comment saying that "a complex expression can overflow the C stack and cause a crash." Does your implementation also suffer from a similar drawback?

Finally, I'd prefer benchmarks using pyperf or hyperfine so that we also know the standard deviation (if you're willing to push for this feature, we like having those kinds of benchmarks).
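A pyperf setup for this can be quite small; something along these lines (the case strings here are placeholders, not the actual benchmark inputs):

```python
# bench_literal_eval.py: a minimal pyperf benchmark sketch. Run it once
# against the stock ast.literal_eval and once against the candidate
# implementation, saving each result with -o, then compare the two files.
import ast
import pyperf

CASES = {
    "constant": "42",
    "unary": "-42",
    "empty_list": "[]",
    "list": "[1, 2, 3, 4, 5]",
    "dict": "{'a': 1, 'b': 2, 'c': 3}",
    "nested": "[[1, [2, [3]]], {'a': (1, 2)}]",
}

runner = pyperf.Runner()
for name, source in CASES.items():
    runner.bench_func(name, ast.literal_eval, source)
```

The two JSON files can then be compared with something like `python -m pyperf compare_to base.json new.json --table`; the default non-table output also reports the standard deviations.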

Stanley5249 commented 1 month ago

The "a complex expression can overflow the C stack and cause a crash" issue comes from the compile() call, i.e. from parsing rather than from the evaluation step. You can reproduce it with ast.literal_eval("+0" * 10**4); since my implementation still parses the input the same way, it does suffer from the same drawback.

```python
>>> import ast
>>> ast.literal_eval("+0" * 10**4)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "C:\Python\Python312\Lib\ast.py", line 66, in literal_eval
    node_or_string = parse(node_or_string.lstrip(" \t"), mode='eval')
                     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Python\Python312\Lib\ast.py", line 52, in parse
    return compile(source, filename, mode, flags,
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
RecursionError: maximum recursion depth exceeded during ast construction
```

I'll proceed with the benchmarks.

Stanley5249 commented 1 month ago

@picnixz

I just want to complain that pyperf is really not easy to use when generating this table. Here it is:

| Benchmark | literal_eval | stack_literal_eval |
|-----------|--------------|--------------------|
| constant | 2.54 us | 1.86 us: 1.36x faster |
| unary | 3.53 us | 3.07 us: 1.15x faster |
| complex | 4.28 us | 3.94 us: 1.09x faster |
| empty_tuple | 3.16 us | 2.68 us: 1.18x faster |
| empty_list | 3.24 us | 2.67 us: 1.21x faster |
| empty_set | 3.80 us | 3.17 us: 1.20x faster |
| empty_dict | 3.60 us | 3.32 us: 1.08x faster |
| tuple | 12.7 us | 13.8 us: 1.09x slower |
| list | 12.4 us | 12.9 us: 1.04x slower |
| set | 12.9 us | 12.7 us: 1.01x faster |
| dict | 23.3 us | 23.0 us: 1.01x faster |
| nested | 73.0 us | 75.9 us: 1.04x slower |
| Geometric mean | (ref) | 1.09x faster |

For reproduction, see here.

picnixz commented 1 month ago

> I just want to complain that pyperf is really not easy to use when generating this table. Here it is:

That's why I do it only with the text table and not the markdown one :D (if you're talking about something else, like how to generate all rows, I agree that it's a bit painful to do. I use a bash script for that though).


Overall, the improvements are only interesting for constants and empty objects. I don't think this is sufficient compared to the maintenance cost (also, lists and sets are quite important IMO). Someone else's opinion would be welcome (maybe @JelleZijlstra?)

JelleZijlstra commented 1 month ago

Thanks for this contribution. The current implementation of ast.literal_eval relies on nested functions while your stack-based version does not; I wonder if that makes a significant difference. Your version also seems to be missing a special case for ast.Expression, which is present in the current implementation.
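For context, the `ast.Expression` special case refers to `literal_eval` accepting already-parsed AST nodes as well as strings: an `Expression` wrapper (what `ast.parse(..., mode="eval")` returns) is unwrapped to its `.body` before conversion. A quick illustration:

```python
# literal_eval accepts already-parsed nodes as well as strings; an
# ast.Expression wrapper is unwrapped to its .body by the special case above.
import ast

expr = ast.parse("{'a': (1, 2), 'b': [3]}", mode="eval")  # an ast.Expression
print(ast.literal_eval(expr))        # {'a': (1, 2), 'b': [3]}
print(ast.literal_eval(expr.body))   # same result from the unwrapped Dict node
```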

Like some previous commenters, I'm concerned that more complex cases (such as lists) apparently get slower. I don't know how most people use literal_eval(), but I suspect that most users pass something more complex than a single constant.

Stanley5249 commented 3 weeks ago

I've switched to a recursion-based approach, similar to the original implementation but much cleaner, resulting in increased performance.

The ast.Expression special case has also been fixed.

You can check out the changes in this branch: Recursion Branch.
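In rough outline, the approach looks like the following sketch (simplified, not the branch itself, and covering only part of the supported literals); the converter is a module-level function rather than a set of nested closures re-created on every call:

```python
# A rough sketch of a recursion-based evaluator with a module-level helper
# (no nested closures). It supports only constants, unary +/- on numbers,
# and tuple/list/set/dict displays; the real literal_eval handles more
# (e.g. complex literals such as 1+2j) and has stricter error reporting.
import ast

def _convert(node):
    if isinstance(node, ast.Constant):
        return node.value
    if isinstance(node, ast.Tuple):
        return tuple(map(_convert, node.elts))
    if isinstance(node, ast.List):
        return list(map(_convert, node.elts))
    if isinstance(node, ast.Set):
        return set(map(_convert, node.elts))
    if isinstance(node, ast.Dict):
        return {_convert(k): _convert(v) for k, v in zip(node.keys, node.values)}
    if isinstance(node, ast.UnaryOp) and isinstance(node.op, (ast.UAdd, ast.USub)):
        operand = _convert(node.operand)
        if isinstance(operand, (int, float, complex)):
            return +operand if isinstance(node.op, ast.UAdd) else -operand
    raise ValueError(f"malformed node or string: {node!r}")

def recursive_literal_eval(node_or_string):
    if isinstance(node_or_string, str):
        node_or_string = ast.parse(node_or_string.lstrip(" \t"), mode="eval")
    if isinstance(node_or_string, ast.Expression):
        node_or_string = node_or_string.body
    return _convert(node_or_string)
```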

Timeit Results

(timeit benchmark chart image)

Pyperf Results

| Benchmark | literal_eval | stack_literal_eval |
|-----------|--------------|--------------------|
| constant | 2.62 us | 1.75 us: 1.49x faster |
| unary | 3.55 us | 2.61 us: 1.36x faster |
| complex | 4.39 us | 3.36 us: 1.30x faster |
| empty_tuple | 3.16 us | 2.30 us: 1.37x faster |
| empty_list | 3.29 us | 2.32 us: 1.42x faster |
| empty_set | 3.77 us | 2.84 us: 1.33x faster |
| empty_dict | 3.59 us | 2.67 us: 1.34x faster |
| tuple | 12.8 us | 11.6 us: 1.10x faster |
| list | 12.7 us | 11.4 us: 1.11x faster |
| set | 12.9 us | 11.8 us: 1.10x faster |
| dict | 22.4 us | 21.3 us: 1.05x faster |
| nested | 74.4 us | 68.6 us: 1.08x faster |
| Geometric mean | (ref) | 1.25x faster |