Extremely Poor Performance Parsing PHP

wspeirs commented 7 years ago

I built the lexer and parser from the grammars repo for PHP and I get extremely poor performance (~3s to lex, and 24s to parse) with a simple 80 line PHP program (taken from the Wordpress repo). A simple example of this performance issue can be found here: https://github.com/wspeirs/php_linter

I read the previous posts (#944, #1243) about performance issues and have:

Set the error handler to BailErrorStrategy
Set the prediction mode to PredictionMode.SLL

I'm using Python 3.5 on Linux, and ANTLR4.7:

Python 3.5.2 (default, Nov 17 2016, 17:05:23) 
[GCC 5.4.0 20160609] on linux

ANTLR Parser Generator  Version 4.7

I'm hoping there is just a "make go faster" flag I'm missing somewhere, and I don't have to re-work the entire grammar (especially seeing as how I didn't make it) or that I just found a nasty bug in ANTLR4.

Thanks!

KvanTTT commented 7 years ago

Could you add a file or link of your simple 80-line program? I'll take a look. C# version of grammar works fairly fast.

wspeirs commented 7 years ago

It is in the main.py file as a string in the repo I linked: https://github.com/wspeirs/php_linter/blob/master/main.py

wspeirs commented 7 years ago

<?php
ignore_user_abort(true);
if ( !empty($_POST) || defined('DOING_AJAX') || defined('DOING_CRON') )
    die();
define('DOING_CRON', true);
if ( !defined('ABSPATH') ) {
    /** Set up WordPress environment */
    require_once( dirname( __FILE__ ) . '/wp-load.php' );
}
function _get_cron_lock() {
    global $wpdb;
    $value = 0;
    if ( wp_using_ext_object_cache() ) {
        $value = wp_cache_get( 'doing_cron', 'transient', true );
    } else {
        $row = $wpdb->get_row( $wpdb->prepare( "SELECT option_value FROM $wpdb->options WHERE option_name = %s LIMIT 1", '_transient_doing_cron' ) );
        if ( is_object( $row ) )
            $value = $row->option_value;
    }
    return $value;
}
if ( false === $crons = _get_cron_array() )
    die();
$keys = array_keys( $crons );
$gmt_time = microtime( true );
if ( isset($keys[0]) && $keys[0] > $gmt_time )
    die();
$doing_cron_transient = get_transient( 'doing_cron' );
if ( empty( $doing_wp_cron ) ) {
    if ( empty( $_GET[ 'doing_wp_cron' ] ) ) {
        // Called from external script/job. Try setting a lock.
        if ( $doing_cron_transient && ( $doing_cron_transient + WP_CRON_LOCK_TIMEOUT > $gmt_time ) )
            return;
        $doing_cron_transient = $doing_wp_cron = sprintf( '%.22F', microtime( true ) );
        set_transient( 'doing_cron', $doing_wp_cron );
    } else {
        $doing_wp_cron = $_GET[ 'doing_wp_cron' ];
    }
}
if ( $doing_cron_transient != $doing_wp_cron )
    return;
foreach ( $crons as $timestamp => $cronhooks ) {
    if ( $timestamp > $gmt_time )
        break;
    foreach ( $cronhooks as $hook => $keys ) {
        foreach ( $keys as $k => $v ) {
            $schedule = $v['schedule'];
            if ( $schedule != false ) {
                $new_args = array($timestamp, $schedule, $hook, $v['args']);
                call_user_func_array('wp_reschedule_event', $new_args);
            }
            wp_unschedule_event( $timestamp, $hook, $v['args'] );
            do_action_ref_array( $hook, $v['args'] );
            if ( _get_cron_lock() != $doing_wp_cron )
                return;
        }
    }
}
if ( _get_cron_lock() == $doing_wp_cron )
    delete_transient( 'doing_cron' );
die();

KvanTTT commented 7 years ago

It parsed less than a 0.5 sec on C# SHarwerll runtime. Unfortunately, I'm not working with Python runtime and can not help you. Maybe it's related to bugs and defects in Python runtime.

ericvergnaud commented 7 years ago

Hi, Python is 30x slower than Java/C#. If you need performance simply don't use Python. Eric

wspeirs commented 7 years ago

@ericvergnaud is there something inherently different between the way Python runtime works and the C# runtime? I struggle to believe that everything is working as expected when it takes 0.5s with C# and 24s with Python. What inside the runtime would cause it to be so slow in Python, but not in C#?

ericvergnaud commented 7 years ago

The tests demonstrate that everything works as expected. That said, the 30x slower is an average i.e. the performance ratio varies from one area to another. Antlr relies enormously on maps, and it seems the JVM and the CLR are very good at optimizing these.

wspeirs commented 7 years ago

I started digging into the profile a bit:

ncalls   tottime    percall    cumtime    percall    filename:lineno(function)
1254773  2.98       0          19.84      0.007      ParserATNSimulator.py:1135(closure_)
5696573  1.509      0          2.349      0          {built-in method builtins.hash}
1535239  1.174      0          1.174      0          ATNConfig.py:25(__init__)
1747037  1.149      0          1.516      0          PredictionContext.py:134(__eq__)
295265   1.105      0          5.102      0          PredictionContext.py:444(mergeArrays)
1595582  1.062      0          4.481      0          ParserATNSimulator.py:1365(getEpsilonTarget)
1270322  1.006      0          19.843     0.007      ParserATNSimulator.py:1093(closureCheckingStopState)

builtins.hash is clearly called a lot, but only accounts for 1.5s. Looking at cum time, it seems that something to do with the closure in ParserANTSimulator.py is taking up all the time:

ncalls   tottime    percall    cumtime    percall    filename:lineno(function)
97       0.002      0          24.52      24.52      {built-in method builtins.exec}
1544     0.206      0          21.744      0.014     ParserATNSimulator.py:290(adaptivePredict)
1544     0.011      0          20.476      0.013     ParserATNSimulator.py:382(execATN)
1126     0.013      0          20.448      0.018     ParserATNSimulator.py:498(computeTargetState)
2936     0.002      0          19.845      0.007     ParserATNSimulator.py:1087(closure)
1270322  1.006      0          19.843      0.007     ParserATNSimulator.py:1093(closureCheckingStopState)
1254773  2.98       0          19.84       0.007     ParserATNSimulator.py:1135(closure_)
1126     0.08       0          19.059      0.017     ParserATNSimulator.py:659(computeReachSet)
56       0.001      0          13.494      0.75      PHPParser.py:3930(nonEmptyStatement)
462817   0.78       0          11.063      0         ATNConfigSet.py:71(add)

This spark any ideas with anyone?

ericvergnaud commented 7 years ago

You will see a very similar profile in Java and C# i.e. most of the time is spent in ParserATNSimulator. If you look at the last line, you'll see that 11s are spent in ATNConfigSet.add, which is in line with what I wrote earlier (should have wrote maps and sets).

antlr / antlr4

Extremely Poor Performance Parsing PHP #1873