cliffclick / aa

Cliff Click Language Hacking
Apache License 2.0
276 stars 22 forks source link

Avoid ArrayIndexOutOfBoundsException in MemJoinNode#value #19

Closed beickhoff closed 3 years ago

beickhoff commented 3 years ago

I ran into a lot of errors like this:

Caused by: java.lang.ArrayIndexOutOfBoundsException: Index 3 out of bounds for length 3
        at com.cliffc.aa.node.MemJoinNode.value(MemJoinNode.java)
        at com.cliffc.aa.node.Node.do_flow(Node.java)
        …

(Side note: I'm still trying to understand why the line numbers are missing from these stack traces, because the LineNumberTables are definitely present in the class files. I'm guessing it's a limitation the testing tool that I'm using.)

It turns out there is a potential problem in MemJoinNode#value.

      if( escs.at(0).test_recur(alias) ) { // In some RHS set
        for( i=1; i<_defs._len; i++ )
          if( escs.at(i).test_recur(alias) )
            break;
      } else i=0;                     // In the base memory
      if( alias == 1 || Env.DEFMEM.in(alias) != null ) // Check never-made aliases
        pubs[alias] = mems[i].at(alias); // Merge alias
                      ^^^^^^^

If we take the first branch, i can be incremented just out of bounds, i.e. i == _defs.len at the end of the loop, in which case mems[i] will throw an exception.

In this case -- if we make it through the loop without a hit -- I'm guessing i should be set to zero. Correct me if I'm wrong here.

cliffclick commented 3 years ago

The bug is real but the fix is (probably) wrong. Once the first test is passed the invariant is: you MUST be able to find the index; the initial test is a fast cutout. This means the rollup summary mentions an alias that is not in any breakout part. Can you send me a test cast?

beickhoff commented 3 years ago

Ok, so if everything else is working as designed, it should not be possible to hit an index-out-of-bounds exception here. It's probably okay to leave everything as is then. Maybe it's worth adding an extra assertion, but that would be it.

This makes sense, because there's no single test case that reproduces this error. It seems to come up because of state bugs in prior tests.

cliffclick commented 3 years ago

Can you make a method full of tests (like the TestParse.parse00 tests) where it triggers somewhere in there?

beickhoff commented 3 years ago

I'll close out this pull request since the fix is wrong.

I have to go back and rework my approach for testing. Right now, I can easily reproduce a single test case that witnessed a failure, but most of the bugs I'm finding are going to require a series of tests to reproduce. So yes, I'd like to be able to provide a method full of test cases that can reproduce bugs like this, but it'll take me some time to get to that point.