jepsen-io / knossos

Verifies the linearizability of experimentally accessible histories.
398 stars 31 forks source link

Add test case showing wgl analysis throws AbstractMethodError. #33

Closed stevana closed 4 years ago

stevana commented 4 years ago

See https://github.com/jepsen-io/jepsen/issues/439 for more details.

aphyr commented 4 years ago

OK, overall I think you're on the right track here! Before I merge this... an AbstractMethodError sounds like something we ought to be able to trigger with a significantly smaller test case. Can I encourage you to shrink this a bit? Like... is it specific to this model? Is every linear check broken? Is the model perhaps broken in some way?

stevana commented 4 years ago

Like... is it specific to this model? Is every linear check broken? Is the model perhaps broken in some way?

Last commit adds a test case showing that the same model and history passes the linear/analysis.

I've tried making the example smaller by using a dummy model and I've also tried making the history shorter, neither worked.

aphyr commented 4 years ago

Neat! I wonder... could it be something to do with memoization? Like... if you shrink the history until you don't see it log "not memoizing...", does it go away?

stevana commented 4 years ago

Last commit adds test that passes with a shorter history, but still prints "INFO knossos.model.memo - More than 1024 reachable models; not memoizing models for this search".

aphyr commented 4 years ago

Huh, cool, so... it's history dependent! That's weird! I wonder if we're triggering some bug in WGL, or if... maybe the model is malformed. Stack trace might help you here.

I'd love to dive in on this myself but I am struggling to meet a deadline :-O

stevana commented 4 years ago

So it seems to pass if we check the subhistory consisting of the the 812 first items of the original history, and fail if we check the subhistory consisting of the first 813 items of the original history. I don't see anything special happening around that time in the history...

maybe the model is malformed.

How can it be that it passes knossos.linear/analysis then?

I'd love to dive in on this myself but I am struggling to meet a deadline :-O

Don't let me distract you.

aphyr commented 4 years ago

Huh, that's neat! There might be some kind of bug in WGL and only that particular state can get us there. AbstractMethodError is awfully suspicious though; that makes it sound like we passed the wrong type of object somewhere. It's a call to .iterator() on wgl.Op, in wgl.clj. Something tried to reduce over an Op--looks like Op->map on line 14, called from model/mismatch, and that got called from model/step.That's where I'd start dropping prns :)

aphyr commented 4 years ago

Oh, and, uh... wgl and linear take very different paths through the history, so it's possible that linear just... didn't trigger whatever model state transition got us into this mess.

stevana commented 4 years ago

(op/Op->map op) in mismatch is indeed the problem, removing op/Op->map fixes it. I don't remember why I added it, presumably to improve the error message.

I'm pretty sure that the code path does get triggered in the linear/analysis case, as I've done many analysis where I've seen the mismatch error message displayed.

Any ideas why .iterator() seems to only be called in the wgl case?

aphyr commented 4 years ago

I'm guessing you only hit the state where that error message gets generated in the WGL search; linear takes a different path. That said... breaking Op->map isn't great! We can fix that by adding an implementation of .iterator to the WGL Op defrecord. :)