hypercore-protocol / hypercore-next

Hypercore 10 is an append only log with multi-writer primitives built in.
MIT License
150 stars 15 forks source link

Emit an event on corrupted hypercores #76

Open Nuhvi opened 2 years ago

Nuhvi commented 2 years ago

By corrupted I mean branched as in The linear history requirement:

... otherwise they will generate branches and "corrupt" the hypercore.

Currently _onfork is called by default whenever a later fork is received and there is a TODO comment for converting that to an opt-in behavior.

What is the plan in the case of a discovered fork, in a Hypercore where the user didn't opt-in for reorgs?

I tried to check what is the current behavior in the case of non-linear history in Hypercore ^9.0.0 but couldn't find anything.

Would the following behavior make sense in the context of your plans?:

Emit an event (e.g corrupted) to notify the consumer that the Hypercore had a fork, as well as the proof of that fork (two signed roots with the same seq, I guess), so applications can keep those proofs in a block-list of duplicitous cores, while purging the core history

// In replicator.js

 async ondata (proof, peer) {
    // ...
    if (peer.state.fork !== this.core.tree.fork) {
      if (!this.allowForks) {
        this.core.emit("corrupted", this.core.tree.findDiff(proof));
      } else if (peer.state.fork > this.core.tree.fork) {
        return this._onfork(proof, peer)
      }
      return
    }
    //...
 }
// In merkle-tree.js

  async findDiff (proof) {
    const batch = new ReorgBatch(this)

     for (const root of batch.roots) {
      const existing = await this.get(root.index, false)
      if (existing && b4a.equals(existing.hash, root.hash)) continue
      return [existing, root]
    }
  }

So applications can use that as follows:

const blockList = new Map()

const core = new Hypercore(storage, key)
await core.ready()

core.on('corrupted', ([root1, root2]) => {
  blocList.set(core.key, [root1, root2]);
  await core.close()
  core.storage.destroy() // Need to test this, but I hope this is possible.
})

const swarm = new Hyperswarm()
swarm.on('connection', socket => core.replicate(socket))
swarm.join(core.discoveryKey, { server: true, client: true })
mafintosh commented 2 years ago

Current behaviour in both 9 and 10 is that the replication stream errors with a signature error. In 10 we have the plumbing to save the error to the oplog so it can be gossiped to other peers, so we’ll do that at some point

Nuhvi commented 2 years ago

I also tried to see what happens if I create two hypercores with the same publicKey, and different, history, I couldn't get it to throw any errors, what am I doing wrong here: https://gist.github.com/Nazeh/b4ed3b2e346741df0ec8728b2d2613e6

mafintosh commented 2 years ago

Ya, that example doesn't know it's been forked yet - the one request results in a valid proof from from of the peers. Try making multiple requests - note the error is emitted on the replication stream.

I don't think we have a test case as well for this in 10, if you wanna contribute that.

Nuhvi commented 2 years ago

I am working on a test for this, but when both forks have the same length, I get a flaky test, where it sometimes gets the block from the original core a, and sometimes it throws an error Invalid checksum at node 4

  test('......', async function (t) {
  const keyPair = crypto.keyPair();
  const a = await create({ keyPair });
  const b = await create(a.key, { keyPair: keyPair });

  await a.append(['a0', 'a1', 'a2']);
  await b.append(['a0', 'b1', 'b2']);

  const c = await create(a.key);

  t.is(a.key === b.key && a.key === c.key, true);

  replicate(c, a, t);
  const [stream] = replicate(c, b, t);

  const request = async (n = 0) => (await c.get(n)).toString();

  const first = await request(1);
  console.log({ first });
  t.is(first, 'a1');

  stream.on('error', (err) => t.fail(err.message));

  const second = await request(2);
  console.log({ second });
  t.is(second, 'b2');
});

When the fork is bigger, it reliably detect the fork though, this whole setup is sensitive to what core was replicated first, I don't know what to do to make this detection more reliable. And hopefully in the future we can have an error message for this case of malicious forking of one's core.


test('...', async function (t) {
  const keyPair = crypto.keyPair();
  const a = await create({ keyPair });
  const b = await create(a.key, { keyPair: keyPair });

  await a.append(['a0', 'a1']);
  await b.append(['a0', 'b1', 'b2']);

  const c = await create(a.key);

  t.is(a.key === b.key && a.key === c.key, true);

  replicate(c, a, t);

  const [stream] = replicate(c, b, t);

  const threw = new Promise((resolve) =>
    stream.on('error', (err) => resolve(err)),
  );

  const request = async (n = 0) => (await c.get(n)).toString();

  const first = await request(1);
  t.is(first, 'a1');

  request(2);
  t.is((await threw).message, 'Remote signature does not match');
});