jepsen-io / knossos

Verifies the linearizability of experimentally accessible histories.
398 stars 31 forks source link

Handling of failed CAS and timeouts #5

Closed ahorn closed 9 years ago

ahorn commented 9 years ago

I've run into an issue with the register model where Knossos says a certain history is linearizable when, in fact, it looks non-linearizable.

The history under question is shown in etcd_005.log. Under certain assumptions (to be clarified below), the history stops to be linerizable at the following entry on line 131:

  INFO  jepsen.util - 16    :ok :read   4

In inspecting this history log, I made two assumptions. Firstly, I assumed that failed CAS instructions have no effect on the state of the model. For example, at the beginning of the log, we have:

  INFO jepsen.util - 1 :invoke :cas [0 1]
  INFO jepsen.util - 1 :fail :cas [0 1]

I ignored these kind of failed CAS instructions. Secondly, I ignored timed out entries. For example, I ignored

  INFO jepsen.util - 0 :invoke :write 4
  INFO jepsen.util - 0 :info :write :timed-out

on line 83 and line 89, respectively.

Under those assumptions, the above history is non-linearizable. I wanted to double check if this is expected, or not. If my assumptions are too strong, I'd appreciate if you could clarify what Knossos assumes about failed CAS and timeouts.

ahorn commented 9 years ago

Okay, so it looks like Knossos treats a timed out invocation such that it has a return entry "r" where "r" is at the end of the history and "r" has a non-deterministic value. That would make sense.

aphyr commented 9 years ago

See https://aphyr.com/posts/314-computational-techniques-in-knossos. Process 3's write of the value 4 could complete just prior to reading 4.

ahorn commented 9 years ago

Thanks! =)