automerge / automerge-classic

A JSON-like data structure (a CRDT) that can be modified concurrently by different users, and merged again automatically.
http://automerge.org/
MIT License
14.75k stars 466 forks source link

Can't send changes over network #378

Closed benjaminfayaz closed 3 years ago

benjaminfayaz commented 3 years ago

Version used: 1.0.1-preview.2 I just started working with automerge a few days ago, so not much experience.

I am trying to accomplish the synchronization of a JSON document with a websocket server between the clients. Currently the server does nothing but sending the messages to all clients except for the sender. This is the document used for this test:

doc = Automerge.from({
    text: new Automerge.Text(),
  });

When changing a document on one client and sending the changes like so:

const changes = Automerge.getChanges(this.doc, newDoc);
this.ws.send(JSON.stringify(changes))

and parsing them on another client like so:

const changes = JSON.parse(message.data);
Automerge.applyChanges(this.doc, changes)

I get the following error:

ERROR TypeError: Not a byte array: [object Object]
    Decoder encoding.js:304

Also tried another approach with generateSyncMessage and receiveSyncMessage. However I am not sure whether I used that correctly. Sender:

const [syncState, binarySyncMsg] = Automerge.generateSyncMessage(
      this.doc,
      this.syncState
    );
this.ws.send(JSON.stringify(binarySyncMsg))

Recipient:

const binarySyncMsg = JSON.parse(message.data);
const receiveSyncMsg = Automerge.receiveSyncMessage(
      this.doc,
      this.syncState,
      binarySyncMsg
    );
this.doc = Automerge.Frontend.applyPatch(this.doc, receiveSyncMsg[2])

This resulted in the same error.

Is this even the correct approach to syncing the document? Or should my server handle parts of the merging? Also should I rather use the generateSyncMessage and receiveSyncMessage instead of sending the changes directly?

ept commented 3 years ago

Hi @benjaminfayaz, sorry about this — the API has changed in the 1.0-preview version, and we haven't yet updated the documentation in the README. You now no longer need JSON.stringify or JSON.parse; instead, Automerge.getChanges returns byte arrays (Uint8Array) that you can send over the network. The same goes for the message returned by generateSyncMessage.

Check that your network protocol supports byte arrays. I think websocket libraries generally should support them. If it only supports strings, you'll have to encode the byte arrays e.g. using Base64.

Steve-OH commented 3 years ago

Would it make sense to provide functions on the Automerge object for doing Uint8Array/base64 conversions? None of the built-in JavaScript functionality for this is very efficient, but node (v 5.0+) has better functionality that results in compact strings:

const b64 = Buffer.from(arr).toString('base64');
const arr = Buffer.from(b64, 'base64');

Considering how many bad roll-your-own implementations of this exist out in the wild (just check StackOverflow...), it might result in fewer headaches all around.

DiamondYuan commented 3 years ago

Would it make sense to provide functions on the Automerge object for doing Uint8Array/base64 conversions? None of the built-in JavaScript functionality for this is very efficient, but node (v 5.0+) has better functionality that results in compact strings:

const b64 = Buffer.from(arr).toString('base64');
const arr = Buffer.from(b64, 'base64');

Considering how many bad roll-your-own implementations of this exist out in the wild (just check StackOverflow...), it might result in fewer headaches all around.

vscode create a cross-environment(node and browser) buffer named VSBuffer.

here is the code. You can copy them to your porject.

you can use VSBuffer.wrap(arr).toString() to generage string , and use VSBuffer.fromString to parse string. It works in node and browser.

ept commented 3 years ago

@Steve-OH I take your point that providing base64 out of the box would be better than people copy-pasting a base64 implementation from stack overflow. Nevertheless, my instinct would be to leave base64 conversion out of Automerge in order to keep the API as small and simple as possible. Things that can sensibly be provided by external libraries shouldn't really be part of Automerge's API.

benjaminfayaz commented 3 years ago

Sadly @Steve-OH 's solution works only inside node.js and not on the browser. So I found this code: https://gist.github.com/enepomnyaschih/72c423f727d395eeaa09697058238727

When I modified the code to encode the Uint8Arrays to base64 and applied the changes it doesn't apply anything. This does not only seem to be the issue when encoding and decoding the arrays, but I couldn't even get it to work with applying changes directly on a test document. Maybe I am just using applyChanges incorrectly.

This code reproduces the issue for me:

const changes: Uint8Array[] = Automerge.getChanges(this.doc, newDoc);
let otherDoc = Automerge.from({ text: new Automerge.Text() });
otherDoc = Automerge.applyChanges(
      otherDoc,
      changes as Automerge.BinaryChange[]
    )[0];

The otherDoc does not have any changes applied.

skokenes commented 3 years ago

This is what I've been using in the browser for encoding messages (in typescript):

export function arrayToBase64String(a: Uint8Array) {
  return btoa(String.fromCharCode(...a));
}

export function base64StringToArray(s: string) {
  const asciiString = atob(s);
  return new Uint8Array([...asciiString].map((char) => char.charCodeAt(0)));
}

It has worked fine for us so far; anyone know of performance issues or bugs with this type of approach?

Steve-OH commented 3 years ago

@skokenes This can apparently fail if the array is too large (I don't know offhand how large "too large" is).

benjaminfayaz commented 3 years ago

Update on getChanges and applyChanges not working

... When I modified the code to encode the Uint8Arrays to base64 and applied the changes it doesn't apply anything. This does not only seem to be the issue when encoding and decoding the arrays, but I couldn't even get it to work with applying changes directly on a test document. Maybe I am just using applyChanges incorrectly. ...

Well, I didn't get it to work with getChanges and applyChanges but when using getAllChanges on one client and applyChanges on another it seems to work. I don't know if there are any downsides or issues with this approach (apart from growing network traffic and resource consumption). If someone has a solution with getChanges I'd be glad to see that.

I used @skokenes encoding and decoding for that.

This is what I've been using in the browser for encoding messages (in typescript):

export function arrayToBase64String(a: Uint8Array) {
  return btoa(String.fromCharCode(...a));
}

export function base64StringToArray(s: string) {
  const asciiString = atob(s);
  return new Uint8Array([...asciiString].map((char) => char.charCodeAt(0)));
}

It has worked fine for us so far; anyone know of performance issues or bugs with this type of approach?

Thanks for this snippet!

ept commented 3 years ago

Thanks for the discussion. It looks like the problem is solved so I am going to close this issue. Please re-open if I am missing something.

benjaminfayaz commented 3 years ago

@ept Yes, thanks for you help 🚀

I am still curious whether there is a solution using the getChanges method instead of getAllChanges, since that would reduce the growing network traffic with each change. But if there are no significant downsides with using getAllChanges this issue is solved for me 😃

ept commented 3 years ago

@benjaminfayaz Automerge.getChanges(oldDoc, newDoc) should work fine and return the changes that were added to newDoc since the earlier version oldDoc. In what way is it not working?

benjaminfayaz commented 3 years ago

@benjaminfayaz Automerge.getChanges(oldDoc, newDoc) should work fine and return the changes that were added to newDoc since the earlier version oldDoc. In what way is it not working?

I made a minimal example here: Edit happy-hill-wu5sn

I would expect the changes from the original doc to appear on the final one, since they both have the same initial state.

EDIT: Here is the code in case the sandbox doesn't work

const firstDoc: Automerge.Doc<{ cards: string[] }> = Automerge.from({
  cards: []
});

const newDoc = Automerge.change(firstDoc, (d) => {
  d.cards.push("some content...");
});

const changes = Automerge.getChanges(firstDoc, newDoc);

const anotherDoc: Automerge.Doc<{ cards: string[] }> = Automerge.from({
  cards: []
});

const finalDoc = Automerge.applyChanges(anotherDoc, changes)[0];

console.log("FINAL DOC:", finalDoc.cards); // does not have the changes from the first/new doc
pvh commented 3 years ago

Thanks for the example code, here's one that works as expected: https://codesandbox.io/s/magical-wu-7dbv8

What's happening in your first example is that the documents don't have a shared history, so the changes aren't applying like you expect. In your example, I would expect that there would be a conflict on .cards since both peers create an empty array independently of each other, but I haven't quite figured out why there isn't one. It seems to me that if this isn't working we should probably throw an error.

The question of whether to merge fields created independently but at similarly named locations is tricky but is an ongoing conversation. We want to avoid merging things that shouldn't be merged, but merge things that should... but how to tell? While you can see the problem of not merging here, Martin's JSON CRDT paper of a few years ago shows examples of what can go wrong if you merge too aggressively -- "torn" objects and incoherent combinations of types.

What my revision of your example is doing is having one "user" create a document and the other user load it. From then on both users will have a shared starting point and either one can make objects and share them back and forth.

Hope that helps get you on the right path.

benjaminfayaz commented 3 years ago

Thanks for the example code, here's one that works as expected: https://codesandbox.io/s/magical-wu-7dbv8

Thanks. You said this example works as expected and that one user loads another user's document, so that it has the same starting point.

I don't really see that in the code you provided. The output of your example does not really work. Is that code outdated or am I missing something?

Also thanks for the explanation of the similar named locations. I get that point, but why doesn't that apply when using getAllChanges and applyChanges on two different documents. They don't share the same history either.

EDIT: Example using getAllChanges Edit vibrant-lichterman-vucl9

pvh commented 3 years ago

Aha!

The getChanges call in your example code is comparing the current state of the firstDoc with the newDoc, both of which occur after the "initialization change" created by the .from().

You then submit that change with applyChanges which is perfectly valid, but has no effect. Because we want changes to be able to arrive in any order off the network automerge enqueues the change but can't apply it yet: its previous commit has not yet arrived.

You can see this by calling getUnappliedChanges() but of course, why would you think to try that?

This is definitely a bug: a user experience bug. I don't see immediately how to make this problem more obvious -- maybe applyChanges() should default to throwing an error for unapplicable changes and we should have a new call like enqueueChanges()?

This deserves more thought, because I'm positive you are not the first or the last person to run into this issue.

(EDIT: As for the code, I totally failed to save the changes and you saw some broken state. User error, no doubt. Anyway, I updated the code to illustrate the problem. Here it is: https://codesandbox.io/s/magical-wu-7dbv8?file=/src/index.ts )

benjaminfayaz commented 3 years ago

@pvh Thanks for the answer. Your codesandbox was also very helpful.

Do any of the problems that you described regarding my approach still apply to the code you provided? Because even when modifying your code it seems to work perfectly fine 😃

pvh commented 3 years ago

Should be fine -- we shouldn't have let you get into trouble in the first place but I'm not sure quite what the fix ought to be yet.

ept commented 3 years ago

@pvh If you have a moment, could you please write up a quick summary of the problem in a separate issue (fine if there's no proposed solution right now), and then we can close this one?