digitalbazaar / bedrock-ledger-storage-mongodb

A storage subsystem for Bedrock ledger.
Other
5 stars 3 forks source link

All `insertMany` usage must be audited to ensure it is tested for proper duplicate handling #52

Open dlongley opened 4 years ago

dlongley commented 4 years ago

Add tests if duplicate cases aren't being tested. Note: It is likely that the index field in the bulk error has changed for the mongo 3.x driver.

aljones15 commented 4 years ago

References:

https://stackoverflow.com/a/46841840

try: e.hasWriteErrors() and e.getWriteErrors() https://docs.mongodb.com/manual/reference/method/db.collection.bulkWrite/#bulkwrite-error-handling-txn

https://docs.mongodb.com/manual/reference/method/db.collection.bulkWrite/#bulk-write-operations

https://github.com/digitalbazaar/bedrock-ledger-consensus-continuity/blob/master/lib/agents/event-writer-agent.js#L111

Stuff that needs to be double checked:

aljones15 commented 4 years ago

what a mongo duplicate error looks like:

 MongoError: E11000 duplicate key error collection: bedrock_ledger_storage_mongodb_test.7911d60d-9e08-4a67-b73e-6b3bed62e77e-event index: event.eventHash.core.2 dup key: { meta.eventHash: "zQmdixV9wxaR9tcdbi4vUmFK1q7B5mt8HPsGAdHsyPsyWz5", meta.blockHeight: 30 }
    at Function.create (/Users/andrewjones/Programs/javascript/digital_bazaar/bedrock-ledger-storage-mongodb/test/node_modules/mongodb/lib/core/error.js:57:12)
    at toError (/Users/andrewjones/Programs/javascript/digital_bazaar/bedrock-ledger-storage-mongodb/test/node_modules/mongodb/lib/utils.js:123:22)
    at /Users/andrewjones/Programs/javascript/digital_bazaar/bedrock-ledger-storage-mongodb/test/node_modules/mongodb/lib/operations/common_functions.js:265:39
    at handler (/Users/andrewjones/Programs/javascript/digital_bazaar/bedrock-ledger-storage-mongodb/test/node_modules/mongodb/lib/core/sdam/topology.js:942:24)
    at /Users/andrewjones/Programs/javascript/digital_bazaar/bedrock-ledger-storage-mongodb/test/node_modules/mongodb/lib/cmap/connection_pool.js:350:13
    at handleOperationResult (/Users/andrewjones/Programs/javascript/digital_bazaar/bedrock-ledger-storage-mongodb/test/node_modules/mongodb/lib/core/sdam/server.js:558:5)
    at MessageStream.messageHandler (/Users/andrewjones/Programs/javascript/digital_bazaar/bedrock-ledger-storage-mongodb/test/node_modules/mongodb/lib/cmap/connection.js:277:5)
    at MessageStream.emit (events.js:311:20)
    at MessageStream.EventEmitter.emit (domain.js:482:12)
    at processIncomingData (/Users/andrewjones/Programs/javascript/digital_bazaar/bedrock-ledger-storage-mongodb/test/node_modules/mongodb/lib/cmap/message_stream.js:144:12) {
  driver: true,
  name: 'MongoError',
  index: 0,
  code: 11000,
  keyPattern: { 'meta.eventHash': 1, 'meta.blockHeight': 1 },
  keyValue: {
    'meta.eventHash': 'zQmdixV9wxaR9tcdbi4vUmFK1q7B5mt8HPsGAdHsyPsyWz5',
    'meta.blockHeight': 30
  }
}
aljones15 commented 4 years ago

ok replicated:

BulkWriteError: E11000 duplicate key error collection: bedrock_ledger_storage_mongodb_test.67f2e3b7-819f-400c-ac35-f14d97e29d8c-event index: event.eventHash.core.3 dup key: { meta.eventHash: "zQmboaxXcqGVTqTcasDSrrHPb2HWa35DkHs8eCiQHRznx2G", meta.consensus: null }
    at OrderedBulkOperation.handleWriteError (/Users/andrewjones/Programs/javascript/digital_bazaar/bedrock-ledger-storage-mongodb/test/node_modules/mongodb/lib/bulk/common.js:1257:9)
    at resultHandler (/Users/andrewjones/Programs/javascript/digital_bazaar/bedrock-ledger-storage-mongodb/test/node_modules/mongodb/lib/bulk/common.js:521:23)
    at handler (/Users/andrewjones/Programs/javascript/digital_bazaar/bedrock-ledger-storage-mongodb/test/node_modules/mongodb/lib/core/sdam/topology.js:942:24)
    at /Users/andrewjones/Programs/javascript/digital_bazaar/bedrock-ledger-storage-mongodb/test/node_modules/mongodb/lib/cmap/connection_pool.js:350:13
    at handleOperationResult (/Users/andrewjones/Programs/javascript/digital_bazaar/bedrock-ledger-storage-mongodb/test/node_modules/mongodb/lib/core/sdam/server.js:558:5)
    at MessageStream.messageHandler (/Users/andrewjones/Programs/javascript/digital_bazaar/bedrock-ledger-storage-mongodb/test/node_modules/mongodb/lib/cmap/connection.js:277:5)
    at MessageStream.emit (events.js:311:20)
    at MessageStream.EventEmitter.emit (domain.js:482:12)
    at processIncomingData (/Users/andrewjones/Programs/javascript/digital_bazaar/bedrock-ledger-storage-mongodb/test/node_modules/mongodb/lib/cmap/message_stream.js:144:12)
    at MessageStream._write (/Users/andrewjones/Programs/javascript/digital_bazaar/bedrock-ledger-storage-mongodb/test/node_modules/mongodb/lib/cmap/message_stream.js:42:5) {
  name: 'BulkWriteError',
  driver: true,
  code: 11000,
  writeErrors: [ WriteError { err: [Object] } ],
  result: BulkWriteResult {
    result: {
      ok: 1,
      writeErrors: [Array],
      writeConcernErrors: [],
      insertedIds: [Array],
      nInserted: 1,
      nUpserted: 0,
      nMatched: 0,
      nModified: 0,
      nRemoved: 0,
      upserted: []
    }
  }
}

as predicted bulkWrite does not contain an index in the error.

aljones15 commented 4 years ago

this is what a BulkWrite error looks like when you are inserting 5 events with one duplicate in the middle:

e {
  "name": "BulkWriteError",
  "driver": true,
  "code": 11000,
  "writeErrors": [
    {
      "code": 11000,
      "index": 2,
      "errmsg": "E11000 duplicate key error collection: bedrock_ledger_storage_mongodb_test.b4eb578f-35d2-4a85-9da1-9d5cadafb5fc-event index: _id_ dup key: { _id: ObjectId('5f7f257bafe70d4c469f9420') }",
      "op": {
        "event": {
          "@context": "https://w3id.org/webledger/v1",
          "type": "WebLedgerOperationEvent",
          "operationHash": [
            "zQmYPbVdeRVEC7KmJncdRvjwury4AZcaFvnd3fJWd1zdQQa"
          ]
        },
        "meta": {
          "consensus": false,
          "eventHash": "zQmeGrLQYxCbK9d37pRgvLYPdvtrXJyS2RdAxVkR4vRMkBq"
        },
        "_id": "5f7f257bafe70d4c469f9420"
      }
    }
  ],
  "result": {
    "ok": 1,
    "writeErrors": [
      {
        "code": 11000,
        "index": 2,
        "errmsg": "E11000 duplicate key error collection: bedrock_ledger_storage_mongodb_test.b4eb578f-35d2-4a85-9da1-9d5cadafb5fc-event index: _id_ dup key: { _id: ObjectId('5f7f257bafe70d4c469f9420') }",
        "op": {
          "event": {
            "@context": "https://w3id.org/webledger/v1",
            "type": "WebLedgerOperationEvent",
            "operationHash": [
              "zQmYPbVdeRVEC7KmJncdRvjwury4AZcaFvnd3fJWd1zdQQa"
            ]
          },
          "meta": {
            "consensus": false,
            "eventHash": "zQmeGrLQYxCbK9d37pRgvLYPdvtrXJyS2RdAxVkR4vRMkBq"
          },
          "_id": "5f7f257bafe70d4c469f9420"
        }
      }
    ],
    "writeConcernErrors": [],
    "insertedIds": [
      {
        "index": 0,
        "_id": "5f7f257bafe70d4c469f9420"
      },
      {
        "index": 1,
        "_id": "5f7f257bafe70d4c469f9421"
      },
      {
        "index": 2,
        "_id": "5f7f257bafe70d4c469f9420"
      },
      {
        "index": 3,
        "_id": "5f7f257bafe70d4c469f9422"
      },
      {
        "index": 4,
        "_id": "5f7f257bafe70d4c469f9423"
      }
    ],
    "nInserted": 2,
    "nUpserted": 0,
    "nMatched": 0,
    "nModified": 0,
    "nRemoved": 0,
    "upserted": []
  }
}
dlongley commented 4 years ago

@aljones15 -- that output says "nInserted": 2. Can you confirm that 4 records are actually in the database? This seems like odd output.

aljones15 commented 4 years ago

@dlongley the test does confirm all records that should be there are (plus one from a before step). It also checks that we get the correct dupHashes back etc.

duplicate event tests are here:

https://github.com/digitalbazaar/bedrock-ledger-storage-mongodb/blob/876c06b43a58cbbc80d75c68b1eae0f16f26b77f/test/mocha/30-event-api.js#L278-L501

dlongley commented 4 years ago

@aljones15 -- we're miscommunicating, I think. I was asking if a single call to the mongodb driver's insertMany inserted all the records. This test looks like it uses ledgerStorage.events.addMany which will repeatedly make calls to insertMany. So, my question is -- did we determine that those repeated calls are necessary? Does a single call to insertMany with a duplicate in the middle only insert the first half of the events when ordered: true?

aljones15 commented 4 years ago

@dlongley the answer to your question is all the tests continue to pass after removing the chunks:

https://github.com/digitalbazaar/bedrock-ledger-storage-mongodb/compare/add-event-addMany-tests...no-chunks

however the chunking actually returns an array which has been shrunk down to make a BulkWrite less likely to fail by being 5% less than the maxBatch size:

https://github.com/digitalbazaar/bedrock-ledger-storage-mongodb/blob/36016ac8c71c55e870d4df8fee84aa9623f960f2/lib/util.js#L8-L40

this is the node mongo driver 3.5 max batch size: https://github.com/mongodb/node-mongodb-native/blob/a27604e0340519845e6f6f91cec2b3ee50bd37cc/lib/bulk/common.js#L760

which appears to be the same as the maxBatch size @mattcollier referenced originally so I think we need to leave the chunking in.

@mattcollier comments?

dlongley commented 4 years ago

@aljones15, we're still miscommunicating. We should not change the chunking.

I'm talking about removing the while loop so it just calls insertMany once per chunk -- because when duplicates occur, all of the events are still inserted, so there's no reason to "keep going to insert the result of the events after the first duplicate error". <-- please prove that this is true or false.

aljones15 commented 4 years ago

I removed the while loop and:

Running Mocha tests...

  Event Storage API
    addMany API
      ��� should add many events (71ms)
      1) should insert non-duplicate events
      2) should add events even if first event is a duplicate
      ��� should add events even if last event is a duplicate (70ms)
      ��� throws TypeError if `meta` is omitted
      3) should not throw on duplicate event

  3 passing (33s)
  3 failing

  1) Event Storage API
       addMany API
         should insert non-duplicate events:

      AssertionError: expected 3 to equal 5
      + expected - actual

      -3
      +5

      at /home/liminal18/Programs/nodeJS/digital_bazaar/bedrock-ledger-storage-mongodb/test/mocha/30-event-api.js:336:37
      at processTicksAndRejections (internal/process/task_queues.js:93:5)

  2) Event Storage API
       addMany API
         should add events even if first event is a duplicate:

      AssertionError: expected 2 to equal 6
      + expected - actual

      -2
      +6

      at /home/liminal18/Programs/nodeJS/digital_bazaar/bedrock-ledger-storage-mongodb/test/mocha/30-event-api.js:410:37
      at processTicksAndRejections (internal/process/task_queues.js:93:5)

  3) Event Storage API
       addMany API
         should not throw on duplicate event:

      Uncaught AssertionError: expected 1 to equal 4
      + expected - actual

      -1
      +4

      at /home/liminal18/Programs/nodeJS/digital_bazaar/bedrock-ledger-storage-mongodb/test/mocha/30-event-api.js:560:43
      at cb (util.js:207:31)
      at processTicksAndRejections (internal/process/task_queues.js:80:21)

Tests failed.
2020-10-09T18:49:09.955Z - critical: [bedrock/master] worker "2273" exited with code "1"
npm ERR! Test failed.  See above for more details.

so the answer is no. It does stop on error.

dlongley commented 4 years ago

Ok, great, thanks for confirming.