automerge / automerge-classic

A JSON-like data structure (a CRDT) that can be modified concurrently by different users, and merged again automatically.
http://automerge.org/
MIT License
14.75k stars 466 forks source link

Fix invalid opSet getMissingChanges return value #362

Closed adelsz closed 3 years ago

adelsz commented 3 years ago

Automerge 1.0.1-preview.0 has been occasionally (hard to consistently reproduce) crashing for me with the following message:

TypeError: Not a byte array: [object Object]
    at new Decoder (automerge-backend/node_modules/automerge-preview/dist/webpack:/Automerge/backend/encoding.js:304:13)
    at decodeChangeMeta (automerge-backend/node_modules/automerge-preview/dist/webpack:/Automerge/backend/columnar.js:778:40)
    at map (automerge-backend/node_modules/automerge-preview/dist/webpack:/Automerge/backend/sync.js:263:20)
    at Array.map (<anonymous>)
    at getChangesToSend (automerge-backend/node_modules/automerge-preview/dist/webpack:/Automerge/backend/sync.js:263:6)

After some debugging I found out that even though the Decoder constructor expects an ArrayBuffer, it would sometimes be passed an object of the form {type: "Buffer", data: [...]} causing the above crash.

I managed to track the issue down to the getMissingChanges function. It looks like getMissingChanges is supposed to always return an array of arraybuffers, but it in some rare cases it returns an array of objects instead (causing the decoder crash). Compare the return type of the branch here with this one.

The issue was caused by calling .toJSON() on its final result. The .toJSON() method does deep conversion on the result, transforming array buffers into objects of the form {type: "Buffer", data: [...]} instead of keeping them as ArrayBuffers.

Replacing .toJSON() with .toArray() fixes this behavior.

ept commented 3 years ago

Hi @adelsz, thank you for this contribution — great work tracking down that exception. I'm happy to merge it, but I'd also like to reproduce it and add a test that will catch this kind of thing in the future, and I haven't managed to reproduce it yet in my environments (I tested in Node 16.0, Chrome, Firefox, and Safari). Can you tell me a bit more about the environment you're running in? Do you perhaps have something adding a .toJSON() method to the prototype of Uint8Array?

ept commented 3 years ago

Here is a minimal test case:

const { List } = require('immutable')
new List([new Uint8Array()]).toJSON()[0] instanceof Uint8Array
// returns true on my machine, but looks like it should return false for you
adelsz commented 3 years ago

Hey @ept, thanks for merging this so quickly.

It seems that the original type is Array<Buffer> not Array<Uint8Array>? Here is a runnable snippet demonstrating the difference between .toJSON and .toArray:

const { List } = require('immutable')
let buffer = Buffer.from([1,2,3]);
let a = new List([buffer]).toJSON();
let b = new List([buffer]).toArray();

console.log(a);
// prints [Object {type: "Buffer", data: [1, 2, 3]}]

console.log(b);
// prints [Buffer <01, 02, 03>]

Seems like immutable.js calls .toJSON recursively on all nested data structures. Buffer implements its own .toJSON method that returns a {type: "Buffer", data: [...]} object.

ept commented 3 years ago

Ah, got it now. Automerge itself always encodes changes as Uint8Array, not as Buffer (since Buffer is Node-specific). However, I suspect that the network or I/O library you're using is turning them into Buffer objects as they are received over the network or loaded from disk. That is normally fine, since Buffer objects are instanceof Uint8Array, and so Automerge happily accepts and works with those Buffer objects. The difference between Buffer and plain Uint8Array only shows up when trying to call .toJSON() on them.

In 6f52350cb10479345276e2e01fde7cedc18cdc6c I updated the test to convert a change into a Buffer object when running in Node, and it then reproduces the bug you found. So all is good now.

Thank you for your help with this!