lutaml / express-grammar

ANTLR grammar for EXPRESS (ISO 10303-11)

Performance with generated Ruby parser #5

Open ronaldtse opened 4 years ago

ronaldtse commented 4 years ago

From @zakjan :

Parsing with the Ruby runtime takes significantly longer than with the other runtimes. Is the Ruby runtime doing something inefficient by mistake, or is the slowness inherent to the Ruby interpreter?

The grammar is moderately complex (191 parser rules, 132 lexer rules), available here. Some input files are available online.

See the measurements (avg of 5 runs) below:

274 lines input file available here

Java - 0.391s JS - 0.453s Ruby - 4.722s

1314 lines input file, structurally the same as the first file but with additional comments (EmbeddedRemark and LineRemark lexer rules)

Java - 0.395s JS - 0.444s Ruby - 6.205s

14742 lines input file available here

Java - 2.393s JS - 3.945s Ruby - 34m55s (1 run)

27754 lines input file

Java - 1.682s JS - 2.916s Ruby - 2m49s
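
To put the numbers above in proportion, here is a small sketch that computes how many times slower the Ruby runtime is than the Java runtime for each input file. The timings are copied from the measurements above (34m55s = 2095 s, 2m49s = 169 s); the struct and function names are made up for illustration. The ratio ranges from roughly 12x on the 274-line file to several hundred times on the 14742-line file.

```cpp
#include <cassert>
#include <cstdio>

// One row per input file; timings in seconds, copied from the measurements above.
struct Timing {
  int lines;
  double java_s;
  double js_s;
  double ruby_s;
};

static const Timing kTimings[] = {
    {274, 0.391, 0.453, 4.722},
    {1314, 0.395, 0.444, 6.205},
    {14742, 2.393, 3.945, 34 * 60 + 55.0},  // 34m55s, single run
    {27754, 1.682, 2.916, 2 * 60 + 49.0},   // 2m49s
};

// How many times slower the Ruby run is than the reference run.
double slowdown(double ruby_s, double other_s) {
  assert(other_s > 0.0);
  return ruby_s / other_s;
}

// Prints one line per input file with the Ruby/Java and Ruby/JS ratios.
void print_slowdowns() {
  for (const Timing& t : kTimings) {
    std::printf("%6d lines: Ruby/Java = %.1fx, Ruby/JS = %.1fx\n", t.lines,
                slowdown(t.ruby_s, t.java_s), slowdown(t.ruby_s, t.js_s));
  }
}
```

Calling print_slowdowns() from a main() prints the per-file ratios. The ratio is far from constant across files, which suggests the cost depends on the input's structure and not only on its length.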

ronaldtse commented 4 years ago

Tracked in https://github.com/MODLanguage/antlr4-ruby-runtime/issues/9 .

ronaldtse commented 4 years ago

Solution: use https://github.com/camertron/antlr4-native-rb or https://github.com/camertron/antlr-gemerator to create C++ target with Ruby bindings.

ronaldtse commented 4 years ago

Problem: https://github.com/camertron/antlr4-native-rb uses the Rice gem, which doesn't work on Windows. This means we will have to either port antlr4-native-rb to use FFI or use the pure Ruby version on Windows.

ronaldtse commented 4 years ago

To test whether antlr-gemerator works, I tried it out but was unable to get the generated parser to compile:

macOS setup

brew install automake autoconf libtool 
gem install antlr-gemerator

Generating the parser:

# Download grammar
cd /tmp
wget https://github.com/lutaml/express-grammar/releases/download/v1.0/Express.g4

# Create new directory to generate parser (generation will fail if Express.g4 is in the same directory)
mkdir -p express-parser
cd express-parser
antlr-gemerator create \
  --author 'Ribose Inc.' \
  --desc 'An EXPRESS parser for Ruby' \
  --email 'open.source@ribose.com' \
  --homepage 'https://github.com/lutaml/express-parser-rb' \
  --grammar ../Express.g4 \
  --root syntax

Compilation fails:

compiling antlr4-upstream/runtime/Cpp/runtime/src/BufferedTokenStream.cpp
express_parser.cpp:6732:12: error: expected ')'
    return Qnil;
           ^
/Users/me/.rbenv/versions/2.6.5/include/ruby-2.6.0/ruby/ruby.h:468:16: note: expanded from macro 'Qnil'
#define Qnil   RUBY_Qnil
               ^
/Users/me/.rbenv/versions/2.6.5/include/ruby-2.6.0/ruby/ruby.h:464:29: note: expanded from macro 'RUBY_Qnil'
#define RUBY_Qnil   ((VALUE)RUBY_Qnil)
                            ^
express_parser.cpp:6732:12: note: to match this '('
/Users/me/.rbenv/versions/2.6.5/include/ruby-2.6.0/ruby/ruby.h:468:16: note: expanded from macro 'Qnil'
#define Qnil   RUBY_Qnil
               ^
/Users/me/.rbenv/versions/2.6.5/include/ruby-2.6.0/ruby/ruby.h:464:21: note: expanded from macro 'RUBY_Qnil'
#define RUBY_Qnil   ((VALUE)RUBY_Qnil)
                    ^
express_parser.cpp:6732:12: error: reference to non-static member function must be called; did you mean to call it with no arguments?
    return Qnil;
           ^~~~
/Users/me/.rbenv/versions/2.6.5/include/ruby-2.6.0/ruby/ruby.h:468:16: note: expanded from macro 'Qnil'
#define Qnil   RUBY_Qnil
               ^~~~~~~~~
/Users/me/.rbenv/versions/2.6.5/include/ruby-2.6.0/ruby/ruby.h:464:22: note: expanded from macro 'RUBY_Qnil'
#define RUBY_Qnil   ((VALUE)RUBY_Qnil)
                     ^~~~~~~
express_parser.cpp:6742:12: error: expected ')'
    return Qnil;
           ^
...

These errors about Qnil repeat.

I checked the generated express_parser.cpp file, and the code looks like this:

Object BooleanTypeContextProxy::BOOLEAN() {
  if (orig == nullptr) {
    return Qnil;   // <===== this line
  }
  [...]
}

This post points to the Ruby source https://stackoverflow.com/questions/51384887/where-is-the-nilclass-singleton-instance-instantiated

https://github.com/ruby/ruby/blob/5aa52587e86b9e2b03cad8b78307e53b777f1df2/include/ruby/ruby.h#L410-L443

i.e. in my ruby.h I see this:

enum ruby_special_consts {
#if USE_FLONUM
    RUBY_Qfalse = 0x00,     /* ...0000 0000 */
    RUBY_Qtrue  = 0x14,     /* ...0001 0100 */
    RUBY_Qnil   = 0x08,     /* ...0000 1000 */
[...]
#else
    RUBY_Qfalse = 0,        /* ...0000 0000 */
    RUBY_Qtrue  = 2,        /* ...0000 0010 */
    RUBY_Qnil   = 4,        /* ...0000 0100 */
[...]
#endif
    RUBY_SPECIAL_SHIFT  = 8
};

[...]
#define RUBY_Qnil   ((VALUE)RUBY_Qnil)
[...]
#define Qnil   RUBY_Qnil
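
A side note on the `#define RUBY_Qnil ((VALUE)RUBY_Qnil)` line: it looks recursive, but the preprocessor does not re-expand a macro inside its own expansion, so the inner RUBY_Qnil is left alone and resolves to the enumerator. A minimal sketch of the same pattern without ruby.h follows. DemoValue, DEMO_Qnil and demo_nil are hypothetical stand-ins, not Ruby's real names.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-ins for Ruby's VALUE typedef and ruby_special_consts enum.
typedef std::uintptr_t DemoValue;
enum demo_special_consts { DEMO_Qnil = 0x08 };

// Same shape as `#define RUBY_Qnil ((VALUE)RUBY_Qnil)`: the macro names itself,
// but it is expanded only once, so the inner DEMO_Qnil is the enumerator.
#define DEMO_Qnil ((DemoValue)DEMO_Qnil)

DemoValue demo_nil() {
  return DEMO_Qnil;  // expands to ((DemoValue)DEMO_Qnil), an ordinary cast
}

// Sanity check: the macro yields the enumerator's value.
void check_demo_nil() {
  assert(demo_nil() == 0x08);
}
```

So the macro itself is well-formed. For the expansion to fail, something has to go wrong with how the name VALUE is resolved at the point of use.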

The express_parser.cpp code looks reasonable; I'm not sure why it raises errors.

ronaldtse commented 4 years ago

I suppose the "reference to non-static member function must be called" error is explained here: https://stackoverflow.com/a/26331779
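
One plausible reading of the compiler output, not confirmed in this thread, is that inside the generated proxy class the name VALUE resolves to a member function and not to Ruby's VALUE typedef. In that case `(VALUE)RUBY_Qnil` would no longer parse as a cast, which would produce both the "expected ')'" error and the "non-static member function must be called" error. The sketch below reproduces that situation without ruby.h. The VALUE typedef, DEMO_NIL_CAST and DemoContextProxy are stand-ins, and the VALUE() member is an assumption about what the generator might emit for a grammar token of that name.

```cpp
#include <cassert>
#include <cstdint>

typedef std::uintptr_t VALUE;  // stand-in for Ruby's VALUE typedef
enum demo_special_consts { DEMO_Qnil = 0x08 };

// Same cast shape as ruby.h's RUBY_Qnil macro, under a distinct name.
#define DEMO_NIL_CAST ((VALUE)DEMO_Qnil)

// Outside any class, VALUE is the typedef, so the macro is a plain cast.
VALUE nil_outside_class() {
  return DEMO_NIL_CAST;
}

// Hypothetical proxy class. The VALUE() member is an assumption standing in
// for an accessor generated from a grammar token named VALUE.
struct DemoContextProxy {
  int VALUE() const { return 0; }

  std::uintptr_t nil_inside_class() const {
    // return DEMO_NIL_CAST;
    // ^ would not compile here: class scope is searched first, so VALUE names
    //   the member function and "(VALUE)" is no longer a cast.
    return (::VALUE)DEMO_Qnil;  // qualifying with :: selects the global typedef
  }
};
```

If this is what happens in express_parser.cpp, the fix would belong in the generator (for example, avoiding member names that collide with Ruby's C API names) and not in ruby.h.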