Open simonlast opened 8 years ago
This is something Prosemirror has gotten working, and they have a similar method of handling documents from what I can immediately tell (immutable data structure decoupled from the DOM).
Thanks for starting the conversation on this. Collaborative editing is something I've thought about quite a bit. Since we haven't had a need for it at Facebook yet, I haven't tried building anything to make it happen.
I've considered your suggestion of exposing the operations themselves, but part of the issue there is that we don't necessarily have information about the operations. For instance, with spellcheck handling, we throw out the old value for the selected text node and replace it with the new value. Since we're using immutable states, the actual delta in the text is unimportant for our state management, and is not tracked.
Additionally, Modifier methods serve to wrap multiple transactions -- we would need to record each transaction in a list to make the appropriate change fully available.
One option I have considered is the Quip approach of locking blocks that are being edited by others. In this way, as the remote user modifies the locked block, the full state of that block can be sent at intervals. (Live per-character changes aren't especially important if the block is locked.) It's a heavier payload to send a whole block, of course.
I think it would also be possible to identify deltas between ContentStates (or individual ContentBlocks) for transmission, after transactions have already been performed.
It seems like for the spellcheck case we could do a "diff" of the old and new strings within a block to infer the change – even just stripping off a common prefix and suffix, so
-writing collaboritivly using draft.js
+writing collaboratively using draft.js
would turn into essentially
writing collabor[-itiv-][+ative+]ly using draft.js
which we could more easily represent as a state "transaction". I personally get frustrated by Quip's "block locking" and much prefer how Google Docs lets two people edit one paragraph.
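A minimal sketch of the prefix/suffix stripping described above, in plain JavaScript. The function name `affixDiff` and the shape of the returned object are assumptions made for illustration; nothing here is part of Draft.js.

```javascript
// Hypothetical helper: isolate the changed span between two strings by
// stripping their common prefix and their common suffix.
function affixDiff(oldText, newText) {
  const maxPrefix = Math.min(oldText.length, newText.length);
  let prefix = 0;
  while (prefix < maxPrefix && oldText[prefix] === newText[prefix]) {
    prefix++;
  }

  // The suffix must not overlap the prefix in either string.
  const maxSuffix = maxPrefix - prefix;
  let suffix = 0;
  while (
    suffix < maxSuffix &&
    oldText[oldText.length - 1 - suffix] === newText[newText.length - 1 - suffix]
  ) {
    suffix++;
  }

  return {
    offset: prefix, // where the change starts
    removed: oldText.slice(prefix, oldText.length - suffix),
    inserted: newText.slice(prefix, newText.length - suffix),
  };
}
```

For the spellcheck example above, this should yield a removal of "itiv" and an insertion of "ative" at offset 16, which is small enough to send as a single change.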
@spicyj: Yeah, I think that can be done in the higher-level component via ContentState diffing, though I haven't tried it. I mentioned spellcheck as an example of where we would have to add something new to perform diffing internally within the handler, since we don't currently need it. I veto adding complexity to the event handlers for this. :)
(Historical note: affix checking was how we originally handled spellcheck, via a MutationObserver within the core component. MO worked okay for the most part, but was a huge headache for handling Apple autocorrect.)
One thing I wonder about but haven't investigated in Google Docs etc. is how undo/redo works during collaborative editing. If I make a bunch of changes that someone else then modifies, what happens if I try to step backward?
MO worked okay for the most part, but was a huge headache for handling Apple autocorrect.
Why was that?
I just typed
hello, here is a banana
in Google Docs then backspaced "banana" and changed it to
hello, here is an orange
in another window. Then undo in the first window gave:
helln orange
n orange
then undo in the second window gave
<-- (nothing)
banana
So it seems like each window has its own relatively-independent undo stack that reverses each "editing transaction".
"...but was a huge headache for handling Apple autocorrect"
What was that specifically? @hellendag
After watching your talk @hellendag, it seemed like the way the cursor is tracked is mapping fairly cleanly to a good model for doing operational transform, differential sync, or google's diff-match-patch algorithm.
I'm going to be experimenting this weekend with it.
What was that specifically? @hellendag
Check out https://jsfiddle.net/salier/WWagu/3/ to play around with it.
The characterData mutations for a regular spellcheck look something like this:
'testt' -> 'test'
'' -> 'test'
We could use the old value for the first mutation as the basis for the affix logic to figure out the diff.
For autocorrect occurring when inserting a space after a misspelling, it looks more like this:
'testt' -> 'test '
'' -> 'test '
'test' -> 'test '
These are kind of trivial changes, and IIRC it got messier with multi-word autocorrects.
So we actually used guesswork to handle it within our diffing logic: assume 2 records for spellcheck, 3 for autocorrect, then try to evaluate the appropriate outcome from there. Since records are batched differently for spellcheck and autocorrect, we also wrapped everything in a requestAnimationFrame to ensure that related records would be batched together. This didn't always work.
The browser provides no events to indicate that spellcheck/autocorrect will occur or has occurred. All you can really do is put up with MO or listen for input (which we do now), and try to figure things out from the state of the world.
Just wanted to say +1 to this.
I think it's one of the most important pieces to unlock in core to make Draft.js a complete (and amazing) solution for building editors. Since the current core is so flexible, most other features can be created entirely in "userland", but this one would need a bit more work in core to make happen. And it would end up being amazing.
@rgbkrk Have you come to a conclusion on which of the algos you've mentioned might be most suitable? I'm just getting started researching this and I'm about to read the pages you've linked to. Personally, the first thing that came to mind was CRDT which was implemented in this project: https://github.com/ritzyed/ritzy
He goes into a decent amount of detail regarding his implementation here: https://github.com/ritzyed/ritzy/blob/master/docs/DESIGN.adoc
One option I have considered is the Quip approach of locking blocks that are being edited by others. In this way, as the remote user modifies the locked block, the full state of that block can be sent at intervals. (Live per-character changes aren't especially important if the block is locked.) It's a heavier payload to send a whole block, of course.
@hellendag Hi. I don't know much about DraftJS yet, but I'm interested to know: do you think this approach can be implemented in userland? For example, would it be possible to add a class to locked blocks, and refuse new input only on these blocks?
I'm interested in this because I need annotation/highlighting like Medium does. They mark every top-level text node (p, h1, h2, pre, code...) with a name, and their annotations are positioned with (nodeName, offset). This way it's easy for them to keep their annotations sane even after text edits (while plugins like AnnotatorJS, based on XPath strings, tend to fail hard after edits). So I'll probably already have named blocks, and I'm OK with not having the shiniest CRDT algorithm; locking text nodes is fine for me.
@adrianmc A CRDT is an easier option to implement than an OT-based solution; however, it might not be well suited to the diff->patch->match approach unless the diffs are very aggressive. The problem is that the CRDT keeps all changes, including removed characters, and so can quickly grow even if you're only making single-character changes. One solution is to change the way a diff occurs: we know that text in an editor changes in a contiguous block, so we can diff based on characters and compare what's in the underlying model with what is physically in the DOM. A simple approach is to step forwards from the start of the two strings and step backwards from the end until the first changes are found on each side; that isolates the changed block, which is a bit more efficient.
@hellendag I'm guessing that Google Docs uses the command pattern rather than storing the immutable state. This approach would make the most sense for a CRDT, where undoing the creation of a character should not delete it but simply tombstone it.
In an OT based solution you could store and replay the operations that have occurred since the object was in the state that is triggered by the undo.
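To make the tombstoning point concrete, here is a small sketch. It is not a full CRDT (it has no replica ids and no ordering for concurrent inserts), and the class name `TombstoneText` and its methods are invented for illustration. It only shows why the stored structure keeps growing: removal marks an element as deleted instead of dropping it.

```javascript
// Hypothetical single-replica model: every character ever inserted stays
// in storage; removal only flips a flag (a "tombstone").
class TombstoneText {
  constructor() {
    this.elements = [];
    this.nextId = 0;
  }

  // Map an index in the visible text to an index in storage.
  storageIndex(visibleIndex) {
    let seen = 0;
    for (let i = 0; i < this.elements.length; i++) {
      if (seen === visibleIndex) return i;
      if (!this.elements[i].deleted) seen++;
    }
    return this.elements.length;
  }

  insert(visibleIndex, char) {
    const id = this.nextId++;
    this.elements.splice(this.storageIndex(visibleIndex), 0, {
      id,
      char,
      deleted: false,
    });
    return id;
  }

  // Deleting a character, or undoing its insertion, tombstones it.
  remove(id) {
    const element = this.elements.find((e) => e.id === id);
    if (element) element.deleted = true;
  }

  get text() {
    return this.elements
      .filter((e) => !e.deleted)
      .map((e) => e.char)
      .join("");
  }
}
```

After inserting "abc" and removing "b", the visible text is "ac" but three elements remain in storage, which is the growth problem described above.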
I am currently working on a project where document editing is an important part and collaborative editing is a requirement. Does anybody here have experience with whether it can be implemented with draft-js, or should I look for another solution?
@gaborsar You should look for another solution; ProseMirror and Ritzy both support collaborative editing.
@disordinary Thank you for the reply! I was afraid of that. Draft is nice, I have already integrated it in the first prototype. The option to have custom block components would be quite important to me, but without a delta based saving I have to rethink my decision... Thank you for the advice as well!
@gaborsar The thing is, it's not a great need for most people, so I've found that if you want it done you just kind of have to do it. Ritzy is built on a particularly inefficient CRDT, so I'd assess it carefully for your use case: it will be fine for small datasets and short editing sessions, but you wouldn't want to write a novel on it. ProseMirror might not be as flexible as you want.
I'm experimenting with mobiledoc-kit to see how I can build collaboration on that platform, but it really is extremely early.
@disordinary I have to plan for text elements that are 100+ pages long. I think I have to read Medium's article very carefully then... :D
@hellendag I have one / two questions about the suggestions you had before:
The Operational Transformation Wikipedia page lists some papers about implementing undo in OT. I haven't checked them out yet.
So I am wondering why no one here is talking about Google's Realtime API (https://developers.google.com/google-apps/realtime/overview#watch_the_video_overview). Their way of thinking about mutations is surprisingly similar to @hellendag's talk. It's also interesting that they solve the same network layer problem that GraphQL does.
So, at the end of the day, I am super interested in continuing to do mutations through GraphQL (w/ server AND p2p?), but keep Draft's power. Else I'd need to stitch Draft AND Google Realtime API, and keep mutating the server through timeouts and GraphQL too - but turn off the subscription to update state from the GraphQL's server connection when collaborating. Not neat!
Sorry I know this doesn't add new ideas, but just thought I'd share what's going on in my head since I have been following this post.
@varunarora For a lot of people, using a Google service is not an option for storing the data. Also, there are a few other OT implementations that you can use (Apache Wave, ShareJS - implemented by an ex-Google-Wave guy). On the other hand, OT is not the only model for collaboration; block-level locking on selection changes would be a good starting point too (like Quip does it). The responsibility of Draft here would be to provide an API to catch every mutation and allow us to implement any kind of collaboration. Of course, this is just my opinion.
@gaborsar Oh absolutely. I don't WANT to use a Google service here. So, thanks so much for telling me about these other OT implementations! Much appreciated. I will try them out, because I think my users won't like block level locking too much.
Yeah I don't know about the scope of Draft's existence, but I actually think because so many people care for this use case, it should be a collaborative add-on/mixin that someone builds exclusively for the Draft world. Like react-router for react.
The ProseMirror solution is almost like OT but operations have to be applied in order. It works more like a rebase. If I understand it correctly the client attempts to send its changes to a central server. If the changes fail then the client gets the latest changes, rebases its changes on top of them, and tries again. All the while it is receiving a stream of successful changes and rebasing buffered changes on top of those. The client only attempts to send one change at a time, buffering multiple changes into one change if necessary.
This is possible to implement with a transactional database that checks the version of the document it is applying incoming changes to. The problem I see here is that it would be potentially harder to scale than OT, which is theoretically easier to horizontally scale.
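A rough sketch of the version-checked submit and rebase loop described above. Everything here is hypothetical: the `Authority` class is an in-memory stand-in for the transactional store, steps are limited to plain text inserts of the form `{ pos, text }`, and each client holds a single pending step, as in the description. ProseMirror's real steps and position mapping are richer than this.

```javascript
// Hypothetical central authority: keeps the accepted steps in order.
// The document version is simply the number of accepted steps.
class Authority {
  constructor() {
    this.steps = [];
  }

  get version() {
    return this.steps.length;
  }

  // Accept a step only if the sender was at the latest version.
  submit(baseVersion, step) {
    if (baseVersion !== this.version) return false;
    this.steps.push(step);
    return true;
  }

  stepsSince(version) {
    return this.steps.slice(version);
  }
}

// Steps in this sketch are plain text inserts: { pos, text }.
function applyStep(doc, step) {
  return doc.slice(0, step.pos) + step.text + doc.slice(step.pos);
}

// Shift a local insert past a remote insert that was accepted first.
function rebase(local, remote) {
  return remote.pos <= local.pos
    ? { pos: local.pos + remote.text.length, text: local.text }
    : local;
}

// Send the client's single pending step, rebasing and retrying on rejection.
function sync(authority, client) {
  while (!authority.submit(client.version, client.pending)) {
    for (const remote of authority.stepsSince(client.version)) {
      client.pending = rebase(client.pending, remote);
      client.version += 1;
    }
  }
  client.version += 1;
  client.pending = null;
}
```

With a base document "ab", one client inserting "X" at position 1 and another inserting "Y" at position 2 from the same version, the second submit is rejected once, rebased to position 3, and accepted; replaying the accepted steps gives "aXbY". The single ordered list of steps is also where the scaling concern comes from, since every write has to pass through that one version check.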
I am interested in collaborative editing as well. Prosemirror is great, but in the near future CKEditor 5 (or alloyeditor 2 which would be based on it) seem to be good contenders.
However, is collaborative editing really not an option with Draft.js?
Is @gaborsar's approach of scanning + diffing Draft state really that bad, performance-wise? I would have thought the immutable state of Draft.js would help to determine quickly which parts have changed.
Then we need a way to implement block level locking...
Just my two cents -- I am 100% against block-level locking, however much simpler it may be. Google Docs set a precedent of how collaborative editing should work and anything less than that feels janky and half-baked.
OT is 100% not doable with draft.js (currently). I would like to explore some alternatives, even less user friendly than Google Docs.
@n-scope If the mutations would be accessible OT could be implemented the following way:
Of course, this would be way too complicated, and undo/redo would be hard to implement too.
@n-scope: Let's say you have a document that is a few hundred pages long (for example a scientific paper). It contains a lot of images, styling, and custom elements (e.g. mathematical expressions) too. Currently, if you change the cursor position, Draft triggers a state change event. Therefore the theoretical diff algorithm would scan the complex document after every cursor position change, and even if it were very fast, it would waste a lot of computing power, and I would not expect it to provide a proper realtime experience. Of course I might be wrong, and of course for shorter documents it could be good enough. In case React is not a requirement for you, Quill can be good enough.
@gaborsar: I agree with you, scanning is not a solution that scales well. But my typical use case is a one page document. One potential optimization would be to restrict the block diff to the previous selection/cursor position. Another one would be to perform batch updates.
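One way the block-level diff could be kept cheap is sketched below. It assumes that unchanged blocks keep their object identity from one state to the next, which is the usual property of immutable data, and it uses a plain `Map` from block key to block object as a stand-in for Draft's block map. The function name `changedBlocks` is hypothetical.

```javascript
// Compare two block maps by reference. Blocks that were not touched keep
// their identity, so only added, removed, or replaced blocks are reported.
function changedBlocks(prevBlocks, nextBlocks) {
  const changes = { added: [], removed: [], modified: [] };

  // Same map object: nothing in the content changed (for example, a
  // selection-only update), so skip the scan entirely.
  if (prevBlocks === nextBlocks) return changes;

  for (const [key, block] of nextBlocks) {
    if (!prevBlocks.has(key)) {
      changes.added.push(key);
    } else if (prevBlocks.get(key) !== block) {
      changes.modified.push(key);
    }
  }
  for (const key of prevBlocks.keys()) {
    if (!nextBlocks.has(key)) changes.removed.push(key);
  }
  return changes;
}
```

The scan is still linear in the number of blocks, but each comparison is a pointer check, and only the blocks reported as modified would need a character-level diff.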
Thanks for the Quill link.
I'm highly interested in having live collaboration inside Draft. Have you looked at using a state-based CRDT model instead of OT? I know very little about CRDTs, but if I understand the concept correctly it shouldn't require each individual change; it just continuously merges the different versions. A CRDT should converge as well as or better than OT, but I don't know of any specific implementations of a state-based CRDT. It's at least something to look into.
@eliwinkelman I did look into those, and there is actually not much difference between state-based and operation-based CRDTs; if I remember correctly, somebody has already proven that every state-based CRDT can be implemented as an operation-based one and vice versa. In general, my experience with CRDTs is that they are slow (linear or worse complexity for each character insertion, or for each line insertion if you only want to support that), most CRDTs do not support undo/redo, etc. So there are a lot of issues with them, and I do not know of any rich text editor in production that uses one. And even if you would like Draft to have a CRDT as its data model, that would mean a rewrite of Draft, as the current state is very far from that.
@gaborsar I saw the same thing with the CRDTs: it seems like the code implementation is different, but the end results are equivalent. The slowness and lack of undo/redo aren't things I saw mentioned, and that kind of kills the idea.
What about something similar to what ProseMirror does (in the general sense, without the deltas)? The server copy has a version number, and clients store the number they are making changes on. When a client sends its new copy (with changes), it becomes the new version if the version numbers still match. If they don't, the newer version is sent back to the client, merged with the client's copy, and sent back to the server. For large documents you shouldn't need to merge the whole thing; you could do it by content block.
I probably missed something, but I'll probably be looking more into this (w/ draft js and alternatives).
@eliwinkelman The most common CRDTs for text editing are:
I think logoot-undo can be a good candidate for draft, but making the key generation right seems to be the most difficult issue. https://pdfs.semanticscholar.org/75e4/5cd9cae6d0da1faeae11732e39a4c1c7a17b.pdf
I am not sure about ProseMirror, but it seems to require manual conflict resolution (I might be wrong about this, so do not blindly accept what I write):
ProseMirror's collaborative editing system employs a central authority which determines in which order changes are applied. If two editors make changes concurrently, they will both go to this authority with their changes. The authority will accept the changes from one of them, and broadcast these changes to all editors. The other's changes will not be accepted, and when that editor receives new changes from the server, it'll have to rebase its local changes on top of those from the other editor, and try to submit them again.
@eliwinkelman Just to make it a bit more clear: Draft is pretty close to logoot, as it has lines (blocks) with keys. To make it work, the key generator function has to be replaced with one that complies with logoot's requirements, and remote insert/remove integration code has to be added to the editor state. If these two modifications are done, react could support logoot, but the current undo/redo still would not work (currently Draft replaces the current state with a previous version): going back to an older version of a line, for example, requires a new key for that line to make it "look like" a remove + insert combination.
@gaborsar logoot-undo looks like a good fit. One thing to consider is that the paper you linked to (which was very good) describes an implementation of undo that allows a peer to undo other peers' actions. From a UI/usage perspective, that doesn't seem like the general use case; in general I'd think that you'd only want a peer to undo their own actions. This still has the same problems you mentioned, but the implementation would be slightly different from the one described in the paper (and likely simpler).
When using collaboration, standard undo/redo could be disabled and a separate logoot friendly version could be called on. This would allow a collaboration friendly undo, without touching the current implementation.
Another thing is spellcheck, which was mentioned by @hellendag as something that Draft doesn't keep track of changes for.
Looking at this, it doesn't look like ProseMirror requires manual conflict resolution; however, it does need to occasionally drop some changes.
@gaborsar I realized that using logoot at a contentBlock level wouldn't work well: concurrent changes to a contentBlock wouldn't be supported. However, it could be used at a character level. This presents the same problems as above with tracking changes to the document. That could be done with keybindings outside of Draft core, but would need to cover every use case. Additionally, logoot-friendly keys don't need to be stored in the document; they can be stored separately (the i'th key corresponds to the i'th character/line).
Note: This was published by mistake once because I slipped with my thumb. Sorry!
Just to throw in my two cents here as someone that's not a Draft.js user, but who has experience building a collaborative text editor and has been following this thread.
The original version of Canvas used ShareJS, where the document model was a big rich text string that we ran text OT on. It worked okay, and we faked undo by storing document versions locally and generating a diff when a user undid.
The current version of Canvas in private beta uses ShareDB, where the document is a collaborative list of blocks, each of which contains collaborative metadata, and a collaborative content string or sub-collaborative list of blocks (e.g. a checklist). This works very well for us and implementing undo/redo was straightforward once we got over a little hurdle in understanding.
We spent time considering CRDT solutions (specifically CmRDTs) like Swarm.js and Logoot. Ultimately, we were attracted to ShareDB's mostly out-of-the-box solution and like the fact that our data is stored as plain JSON in the database and is therefore easily queryable. There would be a little extra work to do this with a CRDT. We were also a little wary of the potentially large size of documents if we were to do a character-by-character CmRDT like Logoot. If OT didn't exist as an out of the box solution, I think we would have looked into Causal Trees (like Swarm).
This is kind of just a brain dump, but I thought maybe my experience might help shine some light on the subject. Our editor is Ember.js based and is architected pretty differently from Draft.js as far as I can tell.
@jclem We had a very similar journey! :) We tested woot and logoot at a prototype level, but woot is generally slow (linear) and logoot keys at the character level can grow way too fast. In the end we landed on ShareDB too, and implemented our own editor (React), borrowing a few ideas from Quill for it. My best advice for anyone here: if having React components in the editor (custom blocks, wrappers, etc...) is not a must-have, then Quill is a good editor and supports collaboration with ShareDB out of the box. I do not think that Draft can be forced to do comparable real-time collaboration, even if I still think it is a nice project.
@gaborsar Out of curiosity, what keeps Draft from doing that kind of OT?
Edit: Ah, re-reading the thread it looks like it's because a full document diff would be necessary for each operation because of how Draft's events are done. That's too bad 😦 If that's the case, it seems like neither OT nor any CmRDT are options. CvRDTs (state-based) might work, but that's very inefficient unless you implement a delta CvRDT, in which case you may have the same problem.
@jclem
Draft has a document model that is pretty far from the existing rich-text OT implementation of ShareDB. Quill on the other hand has been designed together with that. The rich-text type of ShareDB has been created by the quill team.
In my opinion it would require a lot of work to upgrade Draft to support ShareDB the same way. It would mean either a new OT type designed for Draft or replacing the model. In either case, the undo/redo, the modifier utils, the rich text utils, the selection handling and a lot of other things would have to be pretty much rewritten.
For what it's worth, I've been following this conversation since I first started experimenting with Draft. I've actually been building a Draft-like library for React editors that is less opinionated, and architected in a way that is more OT-friendly (all transforms as operations, undo/redo replays the operations, etc.). There is still some more work that needs to happen to have OT be super easy to add straight from core, but it's close enough that if anyone was interested in contributing to help get it there it wouldn't be a near-insurmountable problem, like trying to convert Draft to OT might be.
If anyone is interested, it's called Slate — https://github.com/ianstormtaylor/slate
I hadn't realized that ShareDB and Quill worked together. I looked at ShareJS + Quill, but Quill changed their API going into 1.0 and no longer supported the json otype, and I didn't see the new rich-text type. I think I'll be switching to that.
@ianstormtaylor good luck with slate.
I might have missed something in the discussion but can someone explain why the following isn't possible:
I suppose DraftJS already makes state diffs between previous and next EditorStates. Why can't DraftJS handle an onDiff prop, similar to onChange, that would expose these diffs (and probably the prev/next state along with them)? This would allow the diff to be converted to whatever OT/CRDT format you use.
Am I stupid, or has this already been proposed somewhere?
@ArnaudRinquin There are no diffs in Draft. Draft has modifier functions that are responsible to create new versions of the state (Modifier and RichUtils modules).
Making the selected text bold in Draft:
const editorState = RichUtils.toggleInlineStyle(this.state.editorState, 'BOLD');
this.setState({ editorState });
Making the selected text bold in Quill:
const delta = quill.format('bold', true);
Generated ShareDB / rich-text friendly delta assuming that the characters 10..15 are selected:
{
  "ops": [
    {
      "retain": 10
    },
    {
      "retain": 5,
      "attributes": {
        "bold": true
      }
    }
  ]
}
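To show how such a delta can be read, here is a small sketch that applies retain-only ops to an array of characters. It is not Quill's or ShareDB's implementation; the function name `applyRetainOps` and the `{ char, attributes }` representation are assumptions for illustration, and insert/delete ops are deliberately left out.

```javascript
// Apply a list of retain ops to an array of { char, attributes } items.
// A plain retain skips characters; a retain with attributes formats them.
function applyRetainOps(chars, ops) {
  // Copy the input so the original array is left untouched.
  const result = chars.map((c) => ({
    char: c.char,
    attributes: { ...c.attributes },
  }));

  let index = 0;
  for (const op of ops) {
    if (typeof op.retain !== "number") {
      throw new Error("only retain ops are handled in this sketch");
    }
    if (op.attributes) {
      const end = Math.min(index + op.retain, result.length);
      for (let i = index; i < end; i++) {
        Object.assign(result[i].attributes, op.attributes);
      }
    }
    index += op.retain;
  }
  return result;
}
```

Applied to a 20-character string with the ops above, only the characters at offsets 10 through 14 end up with `bold: true`. The delta carries the position and the change explicitly, which is the information a new EditorState alone does not give you.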
@ArnaudRinquin Quote from @hellendag https://github.com/facebook/draft-js/issues/93#issuecomment-189059975
I've considered your suggestion of exposing the operations themselves, but part of the issue there is that we don't necessarily have information about the operations. For instance, with spellcheck handling, we throw out the old value for the selected text node and replace it with the new value. Since we're using immutable states, the actual delta in the text is unimportant for our state management, and is not tracked.
@gaborsar Oh, I know how to apply changes to the EditorState, but I assumed that, at some point, DraftJS would diff the prev/next EditorStates to detect and apply the changes (which would go against the React way of doing things, I agree).
Can you confirm that there is no EditorState diff made by DraftJS, and that applying changes relies only on React?
Edit: just saw your quote from @hellendag
@ArnaudRinquin Commented the same time, see above ^^ :)
@gaborsar I am not sure I understand @hellendag's quote perfectly. Especially this sentence:
Since we're using immutable states, the actual delta in the text is unimportant for our state management, and is not tracked.
Would it be possible to adjust and make these concerns important and tracked so we can expose operations? Or is it just impossible?
It would be great if this library exposed abstractions required for collaborative editing.
One way to do this would be to expose an event handler that, instead of emitting the entire new state, emitted a transaction object that described how the state changed. An implementer would then need to have a collaborative data model, which would then ingest the transaction objects.
Another way would be for Draft.js to represent its state using a collaborative data model (CRDT for example), and be able to both emit and ingest transactions. This would obviously be a lot harder.