m-ld / m-ld-js

m-ld Javascript engine
https://js.m-ld.org
MIT License
37 stars 2 forks source link

Time ... is from a different process group #148

Closed gsvarovsky closed 1 year ago

gsvarovsky commented 1 year ago

Occasionally seen in compliance test Starting chaos converges with clones starting from snapshot:

Failures:
1) Starting chaos converges with clones starting from snapshot
  Message:
    Unhandled promise rejection: Error: Error: Time [1,0,[3]] is from a different process group @ [0,"kwOTApEDAAM="]
  Stack:
        at /Users/george/m-ld-js/node_modules/@m-ld/m-ld-spec/compliance/clone.js:209:20
        at runMicrotasks (<anonymous>)
        at processTicksAndRejections (node:internal/process/task_queues:96:5)
        at async checkStatus (/Users/george/m-ld-js/node_modules/@m-ld/m-ld-spec/compliance/clone.js:208:11)
        at async send (/Users/george/m-ld-js/node_modules/@m-ld/m-ld-spec/compliance/clone.js:185:15)
        at async ChaosClone.transact (/Users/george/m-ld-js/node_modules/@m-ld/m-ld-spec/compliance/clone.js:123:22)
        at async time (/Users/george/m-ld-js/node_modules/@m-ld/m-ld-spec/compliance/2-chaos/chaos.js:39:18)
        at async ChaosClone.transact (/Users/george/m-ld-js/node_modules/@m-ld/m-ld-spec/compliance/2-chaos/chaos.js:82:30)
        at async ChaosClone.start (/Users/george/m-ld-js/node_modules/@m-ld/m-ld-spec/compliance/2-chaos/chaos.js:71:7)
        at async ChaosTest.<anonymous> (/Users/george/m-ld-js/node_modules/@m-ld/m-ld-spec/compliance/2-chaos/3-starting-chaos.spec.js:20:9)

Log:

2023-04-07T07:35:39.133Z Error: Time [1,0,[3]] is from a different process group @ [0,"kwOTApEDAAM="]
    at GlobalClock.set (/Users/george/m-ld-js/ext/engine/clocks.js:372:23)
    at GlobalClock.set (/Users/george/m-ld-js/ext/engine/clocks.js:374:58)
    at GlobalClock.set (/Users/george/m-ld-js/ext/engine/clocks.js:374:58)
    at GlobalClock.set (/Users/george/m-ld-js/ext/engine/clocks.js:374:58)
    at Object.next (/Users/george/m-ld-js/ext/engine/journal/JournalState.js:66:76)
    at SuSetDataset.<anonymous> (/Users/george/m-ld-js/ext/engine/dataset/SuSetDataset.js:224:58)
    at Generator.next (<anonymous>)
    at fulfilled (/Users/george/m-ld-js/ext/engine/dataset/SuSetDataset.js:11:58)
    at runMicrotasks (<anonymous>)
    at processTicksAndRejections (node:internal/process/task_queues:96:5)
2023-04-07T07:35:39.155Z 82878718 SuSetDataset Error: Time [1,[5],0] is from a different process group @ [0,"kwOTApEDAAM="]
    at GlobalClock.set (/Users/george/m-ld-js/ext/engine/clocks.js:372:23)
    at GlobalClock.set (/Users/george/m-ld-js/ext/engine/clocks.js:374:58)
    at GlobalClock.set (/Users/george/m-ld-js/ext/engine/clocks.js:374:58)
    at GlobalClock.set (/Users/george/m-ld-js/ext/engine/clocks.js:374:58)
    at Object.next (/Users/george/m-ld-js/ext/engine/journal/JournalState.js:66:76)
    at SuSetDataset.OperationApplication.<anonymous> (/Users/george/m-ld-js/ext/engine/dataset/SuSetDataset.js:491:33)
    at Generator.next (<anonymous>)
    at fulfilled (/Users/george/m-ld-js/ext/engine/dataset/SuSetDataset.js:11:58)
    at runMicrotasks (<anonymous>)
    at processTicksAndRejections (node:internal/process/task_queues:96:5) [{},{"@id":"entity8","@type":"Entity","prop0":"3"}]
2023-04-07T07:35:39.156Z 82878718 DatasetEngine Shutting down due to Error: Time [1,[5],0] is from a different process group @ [0,"kwOTApEDAAM="]
    at GlobalClock.set (/Users/george/m-ld-js/ext/engine/clocks.js:372:23)
    at GlobalClock.set (/Users/george/m-ld-js/ext/engine/clocks.js:374:58)
    at GlobalClock.set (/Users/george/m-ld-js/ext/engine/clocks.js:374:58)
    at GlobalClock.set (/Users/george/m-ld-js/ext/engine/clocks.js:374:58)
    at Object.next (/Users/george/m-ld-js/ext/engine/journal/JournalState.js:66:76)
    at SuSetDataset.OperationApplication.<anonymous> (/Users/george/m-ld-js/ext/engine/dataset/SuSetDataset.js:491:33)
    at Generator.next (<anonymous>)
    at fulfilled (/Users/george/m-ld-js/ext/engine/dataset/SuSetDataset.js:11:58)
    at runMicrotasks (<anonymous>)
    at processTicksAndRejections (node:internal/process/task_queues:96:5)
2023-04-07T07:35:39.156Z Error: Time [1,[5],0] is from a different process group @ [0,"kwOTApEDAAM="]
    at GlobalClock.set (/Users/george/m-ld-js/ext/engine/clocks.js:372:23)
    at GlobalClock.set (/Users/george/m-ld-js/ext/engine/clocks.js:374:58)
    at GlobalClock.set (/Users/george/m-ld-js/ext/engine/clocks.js:374:58)
    at GlobalClock.set (/Users/george/m-ld-js/ext/engine/clocks.js:374:58)
    at Object.next (/Users/george/m-ld-js/ext/engine/journal/JournalState.js:66:76)
    at SuSetDataset.OperationApplication.<anonymous> (/Users/george/m-ld-js/ext/engine/dataset/SuSetDataset.js:491:33)
    at Generator.next (<anonymous>)
    at fulfilled (/Users/george/m-ld-js/ext/engine/dataset/SuSetDataset.js:11:58)
    at runMicrotasks (<anonymous>)
    at processTicksAndRejections (node:internal/process/task_queues:96:5)
2023-04-07T07:35:39.156Z 82878718 SuSetDataset Shutting down due to Error: Time [1,[5],0] is from a different process group @ [0,"kwOTApEDAAM="]
    at GlobalClock.set (/Users/george/m-ld-js/ext/engine/clocks.js:372:23)
    at GlobalClock.set (/Users/george/m-ld-js/ext/engine/clocks.js:374:58)
    at GlobalClock.set (/Users/george/m-ld-js/ext/engine/clocks.js:374:58)
    at GlobalClock.set (/Users/george/m-ld-js/ext/engine/clocks.js:374:58)
    at Object.next (/Users/george/m-ld-js/ext/engine/journal/JournalState.js:66:76)
    at SuSetDataset.OperationApplication.<anonymous> (/Users/george/m-ld-js/ext/engine/dataset/SuSetDataset.js:491:33)
    at Generator.next (<anonymous>)
    at fulfilled (/Users/george/m-ld-js/ext/engine/dataset/SuSetDataset.js:11:58)
    at runMicrotasks (<anonymous>)
    at processTicksAndRejections (node:internal/process/task_queues:96:5)
/Users/george/m-ld-js/node_modules/rxjs/dist/cjs/internal/util/reportUnhandledError.js:13
            throw err;
            ^

Error: Time [1,[5],0] is from a different process group @ [0,"kwOTApEDAAM="]
    at GlobalClock.set (/Users/george/m-ld-js/ext/engine/clocks.js:372:23)
    at GlobalClock.set (/Users/george/m-ld-js/ext/engine/clocks.js:374:58)
    at GlobalClock.set (/Users/george/m-ld-js/ext/engine/clocks.js:374:58)
    at GlobalClock.set (/Users/george/m-ld-js/ext/engine/clocks.js:374:58)
    at Object.next (/Users/george/m-ld-js/ext/engine/journal/JournalState.js:66:76)
    at SuSetDataset.OperationApplication.<anonymous> (/Users/george/m-ld-js/ext/engine/dataset/SuSetDataset.js:491:33)
    at Generator.next (<anonymous>)
    at fulfilled (/Users/george/m-ld-js/ext/engine/dataset/SuSetDataset.js:11:58)
    at runMicrotasks (<anonymous>)
    at processTicksAndRejections (node:internal/process/task_queues:96:5)
82878718 Error: read ECONNRESET
    at TCP.onStreamRead (node:internal/stream_base_commons:217:20) {
  errno: -54,
  code: 'ECONNRESET',
  syscall: 'read'
}
gsvarovsky commented 1 year ago

Analysis

  1. A new clone is getting a new clock which includes a certain message
  2. The snapshot request goes to a different collaborator, which has not received that message, and therefore does not include it in the GWC
  3. The new clone accepts the snapshot, but rejects the message as outdated, since its clock includes it
  4. When a message after the missed message is accepted, the GWC may see the clock's fork as invalid – actually this is a better outcome than the alternative, which is that messages are accepted without their causes.