peggyjs / peggy

Peggy: Parser generator for JavaScript
https://peggyjs.org/
MIT License
914 stars 65 forks source link

Performance javascript grammar #409

Closed juanfranblanco closed 1 year ago

juanfranblanco commented 1 year ago

Hi, I have been using a variant of the javascript grammar for a while, but I have found a performance issue with multiple calls in a single statement.

The example function is:


    function test(
         x,
         y,
         z,
         m,
         a,
         b,
         c
    )  {

            var monkey = call(
                call2(
                    object1.call3(
                        "xxxxx",
                        call4(),
                        call5(
                            object2.call6(
                                call7(
                                    "xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx"
                                ),
                                x,
                                y,
                                z,
                                m++,
                                s
                            )
                        )
                    )
                ),
                a,
                b,
                c
            );
    }

as you can see this is due to many inner calls, up to call6 the performance is around 1 second the it might be more than 2 seconds..

Do you have any ideas on how to improve this?

hildjj commented 1 year ago

Can you isolate this to a small grammar input that shows the same behavior? These sorts of things are sometimes ReDoS-style issues.

juanfranblanco commented 1 year ago

Yes that might be the case although is rather complex to reduce the grammar we have CallExpressions https://github.com/peggyjs/peggy/blob/main/examples/javascript.pegjs#L631MethodExpressions etc. Maybe creating a MultiCallExpression might be able to help.

hildjj commented 1 year ago

If you turn on tracing, you might see the problem, with the same piece of input being parsed a lot.

juanfranblanco commented 1 year ago

@hildjj Fixed it on my end, but if anyone encounters this adding caching (or the lack of) was one of the culprits, it might have been on the migration from peg that the setting was removed. Thank you very much.

hildjj commented 1 year ago

Did it work better with caching on, or with caching off? We might not have quite enough thought given to how caching interacts with the newer features.

juanfranblanco commented 1 year ago

Caching on, it made a huge difference.

hildjj commented 1 year ago

Nod. You should still take a look at your grammar for ReDoS-style issues then. Places where you have (a*)+ for example.