nautilus-fuzz / nautilus

A grammar based feedback Fuzzer
MIT License

ANTLR support #1

Open h3ku opened 4 years ago

h3ku commented 4 years ago

Hi,

I noticed that the ANTLR parser doesn't seem to be in the new version of nautilus. Is importing ANTLR grammars into nautilus no longer supported?

eqv commented 4 years ago

That is correct. We found that ANTLR grammars are not worth it. They typically don't include relevant whitespace, as they assume prior tokenization. If for some reason you need to use ANTLR grammars, you can use the importer from the old version (https://github.com/RUB-SysSec/nautilus/blob/master/antlr_parser/src/bin.rs), which turns the ANTLR grammar into a .json grammar. The .json grammar can still be loaded.
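
To illustrate the whitespace point, here is a minimal sketch of what happens when a generator concatenates terminals from a grammar that has no whitespace rules. The `render` function and the token list are hypothetical and are not taken from nautilus or ANTLR; they only show the effect.

```rust
// Hypothetical illustration; not code from nautilus or ANTLR.

/// Concatenate terminal symbols the way a generator would when the
/// grammar itself contains no whitespace rules.
fn render(tokens: &[&str]) -> String {
    tokens.concat()
}

fn main() {
    // Tokens that a lexer-based grammar would describe separately,
    // relying on a skipped whitespace rule to keep them apart.
    let tokens = ["(", "declare-const", "x", "Int", ")"];
    // Prints "(declare-constxInt)": the separators are gone, so the
    // target's lexer no longer sees three distinct words.
    println!("{}", render(&tokens));
}
```

A hand-written grammar avoids this by placing the required spaces directly in the rules.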

5hadowblad3 commented 4 years ago

Related to this question: when I try to convert a grammar from the ANTLR grammars repository (https://github.com/antlr/grammars-v4/blob/master/smtlibv2/SMTLIBv2.g4), antlr_parser panics during translation:

thread 'main' panicked at 'u32 could not be converted to char', src/libcore/option.rs:1190:5
stack backtrace:
   0: backtrace::backtrace::libunwind::trace
             at /cargo/registry/src/github.com-1ecc6299db9ec823/backtrace-0.3.37/src/backtrace/libunwind.rs:88
   1: backtrace::backtrace::trace_unsynchronized
             at /cargo/registry/src/github.com-1ecc6299db9ec823/backtrace-0.3.37/src/backtrace/mod.rs:66
   2: std::sys_common::backtrace::_print_fmt
             at src/libstd/sys_common/backtrace.rs:76
   3: <std::sys_common::backtrace::_print::DisplayBacktrace as core::fmt::Display>::fmt
             at src/libstd/sys_common/backtrace.rs:60
   4: core::fmt::write
             at src/libcore/fmt/mod.rs:1030
   5: std::io::Write::write_fmt
             at src/libstd/io/mod.rs:1412
   6: std::sys_common::backtrace::_print
             at src/libstd/sys_common/backtrace.rs:64
   7: std::sys_common::backtrace::print
             at src/libstd/sys_common/backtrace.rs:49
   8: std::panicking::default_hook::{{closure}}
             at src/libstd/panicking.rs:196
   9: std::panicking::default_hook
             at src/libstd/panicking.rs:210
  10: std::panicking::rust_panic_with_hook
             at src/libstd/panicking.rs:473
  11: std::panicking::continue_panic_fmt
             at src/libstd/panicking.rs:380
  12: rust_begin_unwind
             at src/libstd/panicking.rs:307
  13: core::panicking::panic_fmt
             at src/libcore/panicking.rs:85
  14: core::option::expect_failed
             at src/libcore/option.rs:1190
  15: core::option::Option<T>::expect
             at /rustc/66bf391c3aabfc77f5f7139fc9e6944f995d574e/src/libcore/option.rs:345
  16: antrl_parser::lib::AntlrParser::parse_regex
             at antlr_parser/src/lib.rs:751
  17: antrl_parser::lib::AntlrParser::parse_definition
             at antlr_parser/src/lib.rs:430
  18: antrl_parser::lib::AntlrParser::parse_antlr_grammar
             at antlr_parser/src/lib.rs:108
  19: antrl_parser::main
             at antlr_parser/src/bin.rs:15
  20: std::rt::lang_start::{{closure}}
             at /rustc/66bf391c3aabfc77f5f7139fc9e6944f995d574e/src/libstd/rt.rs:64
  21: std::rt::lang_start_internal::{{closure}}
             at src/libstd/rt.rs:49
  22: std::panicking::try::do_call
             at src/libstd/panicking.rs:292
  23: __rust_maybe_catch_panic
             at src/libpanic_unwind/lib.rs:80
  24: std::panicking::try
             at src/libstd/panicking.rs:271
  25: std::panic::catch_unwind
             at src/libstd/panic.rs:394
  26: std::rt::lang_start_internal
             at src/libstd/rt.rs:48
  27: std::rt::lang_start
             at /rustc/66bf391c3aabfc77f5f7139fc9e6944f995d574e/src/libstd/rt.rs:64
  28: main
  29: __libc_start_main
  30: _start
note: Some details are omitted, run with `RUST_BACKTRACE=full` for a verbose backtrace.

Any idea what causes this?

eqv commented 4 years ago

I would guess that the problem is at antlr_parser/src/lib.rs:751. It seems that some regex in the ANTLR grammar file is not compatible with Rust's regexes. You can probably add line number information to the error manually to help you debug. I'm not really interested in supporting ANTLR parsing, as it has many issues with tokens that are removed in the tokenization phase (e.g. whitespace). I would recommend creating a similar grammar by hand.
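
As a rough sketch of the debugging suggestion: the panic message says a `u32` could not be converted to a `char`, and Rust's `char::from_u32` returns `None` for surrogate code points (U+D800 to U+DFFF) and for values above U+10FFFF. Whether that is what happens at lib.rs:751 is an assumption. The helper `code_point_to_char` and its `line` parameter below are hypothetical and not part of antlr_parser; they only show how an `expect` could be replaced by an error that carries the line number.

```rust
// Hypothetical helper; not part of antlr_parser.

/// Convert a numeric code point to a `char`, reporting the grammar line
/// on failure instead of panicking.
fn code_point_to_char(code: u32, line: usize) -> Result<char, String> {
    char::from_u32(code)
        .ok_or_else(|| format!("line {}: U+{:04X} is not a valid Rust char", line, code))
}

fn main() {
    // 0x41 is 'A'; 0xD800 is a surrogate and has no `char` representation.
    for &(line, code) in &[(1usize, 0x41u32), (2, 0xD800)] {
        match code_point_to_char(code, line) {
            Ok(c) => println!("line {}: {:?}", line, c),
            Err(e) => eprintln!("{}", e),
        }
    }
}
```

With an error like this, the offending rule in the .g4 file can be located and either rewritten or skipped.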