Bug: lexical-yjs create nodes using constructor without args

GermanJablo commented 8 months ago

The createLexicalNodeFromCollabNode function creates a node using the constructor without arguments, and then synchronizes the properties using collabNode.

This can be fatal if the constructor requires arguments. It is dangerous to call a constructor in a way that was not designed by the user.

Lexical version: 0.12.0

Steps To Reproduce

Use a node whose constructor throws an error if an argument does not exist with collab mode.

The current behavior

Error is thrown.

The expected behavior

No error is thrown.

fantactuka commented 8 months ago

This one would be tough to resolve, basically collab plugin needs to reconstruct node based on bag of properties without knowing anything about constructor, so some sort of serialization/deserialization would be required. We already have import/explort JSON, but the risk here is that it was not initially used by collab so existing node implementations might not include some of the fields in resulting json causing data loses in collab

GermanJablo commented 8 months ago

It makes sense that the contract is given by serialization in JSON or HTML.

but the risk here is that it was not initially used by collab so existing node implementations might not include some of the fields in resulting json causing data loses in collab

That's already true either way right now. If someone forgets to serialize a property, they won't be able to save/load its editorState, or copy/paste, or import/export, etc.

JorgeAtPaladin commented 5 months ago

We have the same issue syncing our custom nodes. Can the correct way of syncing please be explained? We assumed the content would be serialized and the diffs would be deserialised or something.

JorgeAtPaladin commented 5 months ago

Also- does this even work with LexicalEditor properties?

Eg. https://github.com/konstantinmuenster/lexical-examples/blob/6ed2e5bd6e23d6afcd4c7320b7a5778b94878861/examples/lexical-nested-editor/src/Editor/nodes/Comment/CommentNode.tsx#L37

Would this CommentNode even be able to sync?

GermanJablo commented 5 months ago

2 solutions come to mind:

A way to automate the serialization of any node, like this https://github.com/facebook/lexical/pull/3931
Add importYjs/exportYjs methods.

I think solution 2 is unavoidable, since even in the PR I mentioned, import/exportJSON methods can still be overwritten due to occasions where properties are not serializable.

Also, I think importYjs/exportYjs is something that could be used for decorators. For example, if someone wanted to integrate Monaco or Codemirror and have a working undo manager. See: https://discord.com/channels/953974421008293909/1163567808823705732

JorgeAtPaladin commented 5 months ago

@GermanJablo big fan of generic but overridable import/exports. The importYjs/exportYjs optional functions which fallback to the JSON make a lot of sense as well.

The WIP you mentioned is from Feb, is this stale?

GermanJablo commented 5 months ago

Yeah. There is a similar PR with a "generic" clone that was ready to be merged but it seems the Lexical team hasn't had the time.

As soon as that PR is merged I can update the import/export one.

JorgeAtPaladin commented 5 months ago

@acywatson , would love to get these PRs merged. Is there anything blocking this?

Having these updates would be a big benefit to the dev UX for Lexical and as discussed above would furthermore set the starting ground for fixing yjs support for many custom decorator nodes.

acywatson commented 5 months ago

@acywatson , would love to get these PRs merged. Is there anything blocking this?

Hey - when you say "these PRs" I assume you mean the work around generic JSON serialization/deserialization.

The only thing holding this up is that it's a big invasive change and landing it means that we (core) have to devote resources to monitoring and fixing any issues it might cause across all the Lexical implementations at Meta (not to mention all the external dependents). As you might guess, there are many such implementations and the consequences of a bug in some of them can be quite significant.

I've been lobbying to get this merged for a while now, and I think we will. In principle, we've agreed it's an improvement. Logistically, it does present some challenges.

Hope that's helpful.

GermanJablo commented 5 months ago

Contributing to the traceability of this discussion.

In the past there have been several issues and PRs seeking to determine which properties should be excluded in yjs.

For reference, see:

Errata:

Something I'm thinking is that instead of defining importYjs or exportYjs methods, perhaps we could use importJSON and exportJSON as the intermediate bridge between the lexical node and the yjs node, so that the user does not have to code another serialization. This way the constructor would not be used, which is not reliable. In fact, in the discussion a few days ago that I proposed import/export yjs I had forgotten, but now I see that this had been my initial idea in the comment above:

It makes sense that the contract is given by serialization in JSON or HTML.

danielblignaut commented 4 months ago

On this same topic, if you set excludedProperties on your node type, those properties are not synced. When there's a change in that node and yjs syncs the node, does it basically set the current value of the "excludedProperty" to undefined? or will the value of the "excludedProperty" remain the same local value from before the sync update?

GermanJablo commented 4 months ago

Update: I correct myself again.

I think ideally a decorator would actually have an export/importYjs method. In case it is not user defined, it could fallback to export/importJSON.

The reason is that currently if one attempts real-time collaborative editing in a decorator, what happens is that the state of the decorator is completely overwritten. For example, you cannot see another user's cursor in a table cell. The same would happen if a text editor such as Monaco or Prosemirror were implemented. It wouldn't be collaborative editing, but rather LWW, which has noticeably inferior UX.

A way for the decorator to expose its underlying YDoc would allow it to synchronize its data structure more atomically.

yf-yang commented 1 month ago

Hey I agree using serialized JSON would be a better approach (especially when taking versioning into consideration) , so I decide to implement one personally (and publicly). Now I'm considering the following issue:

TextNode and ElementNode should be treated differently, because we want to implement the diff correctly. (Easy to resolve, just mention it here).
Nested editors needs to be treated differently. I think current Lexical implementation is wrong because nested editor should not be implemented as a sub doc. However, I need to investigate how to make it work

Taking those stuff into consideration, I expect the implementation would be somehow ugly, but I don't expect it being merged to the main branch, so I'll directly add new special properties to resolve the second issue and do not consider backward compatibility.

matsuyama-k1 commented 2 weeks ago

I agree using serialized JSON would be a better approach too.

I'm encountering an issue where YDoc(Yjs document) is excessively large, so I am looking to reduce the document size by shortening the names of nodes. For example, I want to change a long variable name like someNode.isGeneratedByBackendAndUntoutched to shorter one such as isg. However, directly shortening the variable names makes the code less readable.

Therefore, I would like to shorten the variable names only when exporting to JSON, to minimize the size of the YDoc.

yf-yang commented 1 week ago

After writing some codes, I found a big drawback of using exportJSON/importJSON: test whether or not a node has changed.

Let me briefly describe what would happen:

Suppose a remote yjs change is synced to current client.
A node has changed, so we call importJSON and creates a new node.
We need to replace the previous node with the new one. Here the node is marked as dirty.
After the yjs transaction finishes, Lexical reconciles everything, then after Lexical is done, an update listener is triggered, and a new yjs transaction has started.
The node is marked as dirty, we don't know what happens, so we call exportJSON.

Now, since every time exportJSON creates a brand new object, the new JSON and the old JSON are different. Therefore, we have no idea if the yjs representation of the node should change.

I don't know if this can be easily resolved. I tried to compare the JSON's values with their existing Yjs storage. If the values are primitive, or the values are directly imported and exported by the Lexical node, then this test is still fine.

However, it is possible that users create a node which dynamically creates a new object value when calling exportJSON. For example, for those nodes containing a nested editor, this._someNestedEditor.toJSON() is called when exportJSON is called. So we can imagine, in these cases, we have to update yjs representation, then a remote client would have to call importJSON, then the change is sent back ... infinite loop.

Take a step back, in original collabNode implementation, all the properties of a node are directly compared, if they are different, then the corresponding yjs data is updated. So it seems the assumption that those properties are static (apart from nested editors) exists for a while.

I personally cannot come up with better solutions, so maybe we need to follow that assumption.

yf-yang commented 1 week ago

My bad, just noticed if tag has collaboration, then the change is skipped.

Needs some final debugging to make the whole stuff work.

yf-yang commented 1 week ago

https://github.com/yf-yang/lexical/tree/refactor-yjs

To whom it may concern, I've already refactored most of the parts. Unit tests are passed. Played with the playground on my own and it works fine.

I haven't worked on nested editors (I really hate nested editor design, maintaining multiple editor states simultaneously is really really really BAD). Needs suggestions on how to make them work when using JSON serialization methods.

facebook / lexical