adrian-thurston / ragel

Ragel State Machine Compiler
MIT License

[ragel] "ragel: rlhc stopped by signal: 11" in Rust generator #1

Open rljacobson opened 4 years ago

rljacobson commented 4 years ago

I often get this issue:

$ ragel-rust -o scanner.rs scanner_rl.rs
ragel: rlhc stopped by signal: 11

It's difficult to debug. It has gone away when, instead of including a machine from another file, I just copy and paste it. Still diagnosing this.

Edit: If I delete the .ri file I get a (legitimate) parser error and a .ri that doesn't reproduce the original signal 11 error! I have no idea what's going on, but somehow I get the .rl/.ri file combination into a particular bad state, and deleting the .ri resets the state to something that does not necessarily reproduce the bad state...?!? Consequently, I can't consistently reproduce this error. Let me work on it some more until I can.

adrian-thurston commented 4 years ago

Which platform are you on?

Try running ragel with the --no-fork option under valgrind (if available).

rljacobson commented 4 years ago

I'm on macOS Catalina. I'll try that and report back.

pnck commented 1 year ago

@adrian-thurston

I encountered a similar issue:

==648== Invalid read of size 8
==648==    at 0x49F13FC: input_stream_pop_stash (input.c:108)
==648==    by 0x49F1E0E: input_undo_consume_data (input.c:439)
==648==    by 0x49EA9FB: send_back_text (pdarun.c:157)
==648==    by 0x49EAE5A: send_back (pdarun.c:256)
==648==    by 0x49EF3F4: parse_token (pdarun.c:1755)
==648==    by 0x49F05E2: colm_parse_loop (pdarun.c:2113)
==648==    by 0x49F083F: colm_parse_frag (pdarun.c:2188)
==648==    by 0x4A0CDD1: colm_execute_code (bytecode.c:2755)
==648==    by 0x49FD8C0: colm_execute (bytecode.c:582)
==648==    by 0x4A1D9B8: colm_run_program2 (program.c:222)
==648==    by 0x4A1DA17: colm_run_program (program.c:231)
==648==    by 0x4A74BEF: InputData::runRlhc(int, char const**) (inputdata.cc:1220)
==648==  Address 0x18 is not stack'd, malloc'd or (recently) free'd

https://github.com/adrian-thurston/colm/blob/28b6e0a01157049b4cb279b0ef25ea9dcf3b46ed/src/input.c#L108

si->stash is null at that moment.
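
For anyone reading the trace above: a faulting address as small as 0x18 is typical of a field read through a null pointer, where the address equals the field's offset from a base of 0. Below is a minimal sketch of that arithmetic. The struct is hypothetical and is not colm's actual layout; it only assumes three 8-byte fields precede the one being read.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical layout for illustration only, NOT colm's real struct:
 * three 8-byte fields followed by the field being read. */
struct example_node {
    uint64_t a;
    uint64_t b;
    uint64_t c;
    uint64_t target; /* sits at offset 24, i.e. 0x18 */
};

/* Address that a read of node->target would touch if node were NULL:
 * base address 0 plus the member offset. Nothing is dereferenced here. */
static uintptr_t fault_address_if_null(void)
{
    return (uintptr_t)0 + offsetof(struct example_node, target);
}
```

This is consistent with a null si->stash, but the real field offsets should be checked against the colm headers at that commit.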

I apologize for not providing a minimal reproduction; my source is too complex. What I can confirm is that the issue isn't caused by a single machine instance. If I copy the machine definition that triggers the crash into a new file, it doesn't crash at all.