FXMisc / UndoFX

Undo manager for JavaFX
BSD 2-Clause "Simplified" License
99 stars 17 forks source link

Add support for directed acyclic graph undo/redo #8

Open JordanMartinez opened 8 years ago

JordanMartinez commented 8 years ago

Per TomasMikula/RichTextFX#233.

JordanMartinez commented 8 years ago

The UndoManager interface could be split up into two interfaces LinearUndoManager and NonLinearUndoManager, but then there's no common interface between the two so that one could switch one out with another within the same object (an issue which will arise in RichTextFX's StyledTextArea).

So, it would be best if the UndoManager interface could be expanded to also support non-linear undos and redos. For example, 5 views display a different portion of the same document. Each view tracks their own edits separately. If a user undoes some action in one view, it can only be redone within that same view. Trying to redo an action in some other view will do nothing.

To expand UndoManager in this way, we would also need to include an argument to keep track of which source (e.g. which view in the above example) is calling the corresponding method and thus which ChangeQueue to use to undo/redo the stored action.

Then, we'd need to rename UndoManagerImpl to LinearUndoManager and just pass a null argument as the source. Additionally, we'd need to write NonLinearUndoManager that uses a generic to determine which type of object to use when tracking the source.

JordanMartinez commented 8 years ago

I'm not familiar with how one implements a Directed Acyclic Graph, so that's to my disadvantage....

Still, I'll attempt to piece together how this should work.

Using arrows, let's say there are 2 areas, each of which have made the following changes where the order of changes is alphabetical (change A comes before change B):

Start <-- A <-- B <-- C <-- D <-- E <-- F <-- G <-- Finish

Order: "Content of Document at this step" (Area that did action: the action that was done)
============
Start: "initial text"
A: "initial" (Area 1: deleted " text")
B: "initial stuff" (Area 2: inserted " stuff")
C: "stuff" (Area 1: deleted "initial ")
D: "initial stuff" (Area 2: inserted "initial ")
E: "initial good stuff" (Area 1: inserted " good ")
F: "initial good stuff with oreos" (Area 1: inserted " with oreos")
G: "initial good stuff with oreos and milk" (Area 1: inserted " and milk")
(Finish): "initial good stuff"

Area 1's Undo History: 
- undo G (delete " and milk")
- undo F (delete " with oreos")
- undo E (delete " good ")
- undo C (insert "initial ") - can't undo until D is undone
- undo A (insert " text") - can't undo until B is undone

Area 2's Undo History:
- undo D (delete "initial ") - can't undo until E is undone
- undo B (delete " stuff") - can't undo until C is undone

With RichTextFX, only one change at a time can be applied to the underlying document shared by multiple clones. However, the above example is still very linear: to get back to "Start", one would need to call undo() on the correct area in the correct order. This is not only error prone but also ridiculous and not what we desire anyway. Furthermore, it is still possible to undo D and B before any of the other later actions are undone: "initial " and "stuff " can still be deleted without needing to undo any other action before their deletion.

So, the change stored would need to know what conditions are needed for it to be considered "valid" and "invalid." For example, D and B are still "valid" because they can still delete their corresponding text. Therefore, a change could keep track of the respective text to delete and its actual location. Thus, D would not only know to delete "initial " but also that "initial " was originally the range [0, 7].

But how should changes in other areas be accounted for?

Thus, despite changes from other areas, the undo would still function correctly. However, if some new text is inserted in-between it, the change would no longer be valid. For example, what should happen in the following scenario? The user types some text in one area ("text"), then inserts text in-between that first insertion using a second area("te___xt"), and then calls undo in the first area.

In the above scenario, the change would not be "valid." However, if the user, whether via the second area or perhaps a third area, delete the text he/she inserted in-between the original text, that change is now "valid" again. Thus, if the user calls undo() in the first area, the inversion of the change should still work.

TomasMikula commented 8 years ago

Let the edit graph define as an acyclic graph where vertices are edits made to the document, and there is an edge b -> a iff edit b was made after edit a and overlaps with it.

Let's call edits a and b mutually independent iff there is no directed path from a to b and no directed path from b to a.

Then edits that can be undone are the ones with no incoming edges.

When an edit is undone, all edits mutually independent with it must have their ranges updated.

  • Should the characters of the original text be deleted but the second area's insertion left untouched ("__")?
  • Should the adjusted range (which now includes the recently inserted text from area 2 and the original text) be deleted ("")?
  • Or should this change just be considered "invalid" and skipped over and the next change valid applied("te__xt")?

What I described above would give rise to the third behavior—the change would not be undoable at the moment, because another change depends on it.

There might be some attempt to bubble up a change so that it becomes undoable. This would try to recalculate changes as if they occurred in a different order, but yielding the same result. For example, to bubble up change b in this graph (assume all edges are oriented top-to-bottom)

c d e
 \|/
  b
  |
  a

would result, if successful, in graph

  b
 /|\
c d e
 \|/
  a

in which b can be undone.

JordanMartinez commented 8 years ago

(I'm restating your words, so I understand it myself better). So in your above edit graphs... (I've reversed the direction so I can use the "^" character)

Earlier edits
|
|      a
|      ^
|      |
|      b
|     /|\
|    ^ ^ ^
|    c d e
|
Later edits

a is the first edit done and b follows a because it points to a. c, d, and e can all be undone at the current moment because nothing points to them. Additionally, either one can be undone because they don't point to each other. However, b, in this current graph, cannot be undone because the others point to it.

Given this graph:

Earlier edits
|
|      a
|      |\
|      | \
|      |  \
|      |   \
|      |    \
|      |     |
|      ^     ^
|      d --> e
|
Later edits

Despite pointing to the same object a, e is not mutually independent with d because d points to it. Thus, d is the only edit that can be undone because it's the only one without something pointing at it.

Now, returning to the first graph, there might be a time where one wants b to be undoable, even if it isn't in the first graph. Thus, we can "bubble b up" (essentially reorganize the graph) and thus make b undoable. For example, the first graph could be reorganized to the following one in which b has nothing pointing to it:

Earlier edits
|
|      a
|     /|\
|    ^ ^ ^
|    c d e
|     \|/
|      ^
|      b
|
Later edits

Now, however the graph is reorganized, the final state of the content must be the same as before. So, if the code tries to reorganize the graph in all its possible ways, and yet it cannot end up with a graph where the final state matches the original graph's state, then the desired edit cannot be "bubbled up".

TomasMikula commented 8 years ago

Now, however the graph is reorganized, the final state of the content must be the same as before. So, if the code tries to reorganize the graph in all its possible ways, and yet it cannot end up with a graph where the final state matches the original graph's state, then the desired edit cannot be "bubbled up".

The reorganizing will not be just changing the arrows and adjusting ranges of the edits, but also changing the content of the edits. (If it was just adjusting ranges, the edits would have been independent already.) With allowing modifications of edits, a reorganization yielding the same final document will always be possible, but not always meaningful. What are meaningful bubbling operations remains to be determined.

JordanMartinez commented 8 years ago

Could you give one example of a meaningful bubble operation? I understand how the idea works but don't understand its relevance.

The only example that comes to mind where a bubble operation would be useful is the following graph:

Earlier
|
|   START
|     ^
|     |\
|     | \
|     |  \
|     |   \
|     |    \
|     a     b
|     ^     ^
|     |    /
|     |   /
|     |  /
|     | /
|     |/
|     c
|     ^
|     |
|     d
|
Later

Let's say Area 1 made edits a, c, and d. and Area 2 made edit b. If I wanted to call undo() in Area 2, nothing would happen because c and d must be undone first. Thus, edit b would need to "bubble up" to the top so that it is undoable:

Earlier
|
|   START
|     ^
|     |
|     |
|     |
|     | 
|     | 
|     a
|     ^
|     |\
|     | \
|     |  \
|     |   \
|     |    \
|     c     |
|     ^     |
|     |     |
|     |     |
|     |     |
|     |     |
|     |     |
|     d     b
|
Later
TomasMikula commented 8 years ago

I'm going to give an example with a much simpler edit graph:

START
  ^
  |
  a
  ^
  |
  b

Edit a was made by area 1, edit b by area 2. Area 1 wants to undo, so we will try to bubble a to the top (depicted at the bottom ;)) to get

START
  ^
  |
  b'
  ^
  |
  a'

Let's say the initial text was stuff.

Edit a changed that to some great stuff:

a = { pos: 0; removed: ""; inserted: "some great " }

Edit b changed that to some mediocre stuff:

b = { pos: 5; removed: "great"; inserted: "mediocre" }

A meaningful bubble up operation could result in new edits being

b' = { pos: 0; removed: ""; inserted: "mediocre " }

a' = { pos: 0; removed: ""; inserted: "some " }

When b' and a' are applied, in this order, to the initial text, we get the same final text, some mediocre stuff. a' can now be undone, to get mediocre stuff, and we can somewhat meaningfully say that that state corresponds to applying b as if a was never applied.

(Note that I didn't give you an algorithm how to compute b', a' from a, b. I just gave an example of what could be the result of the algorithm in one instance. I don't think a general algorithm with meaningful results is in any way trivial.)

JordanMartinez commented 8 years ago

(Note that I didn't give you an algorithm how to compute b', a' from a, b. I just gave an example of what could be the result of the algorithm in one instance. I don't think a general algorithm with meaningful results is in any way trivial.)

Yeah... I can see how that can get complicated very quickly with more complex edits...

JordanMartinez commented 8 years ago

Overall, we're thinking of a structure like this:

selection_019

JordanMartinez commented 8 years ago

I created an DAG of undos one might have in an area. The question I have is, how would edits A-F "bubble up" to the top?

dag undofx - example

JordanMartinez commented 8 years ago

When b' and a' are applied, in this order, to the initial text, we get the same final text, some mediocre stuff. a' can now be undone, to get mediocre stuff, and we can somewhat meaningfully say that that state corresponds to applying b as if a was never applied.

Referring to your previous comment, we'll say the rules of "bubbling up" edits start with the following ideas:

Using the above photo as an example, edit A's insertion of "I" is still found at the final end of the text. Thus, A is "bubbly" because the edit graph could be changed so that edit A' (which inserts I before gave his milk away...) is the last edit in the graph.

Additionally, if A was bubbled up, then there is a portion of its edit that would otherwise be lost (like eggs). In this case, perhaps we could split edit A into two edits and apply them at different times. So, edit A1 (inserts like eggs) would take the place of edit A in the graph and edit A2 (inserts I) could then be bubbled up to the top of the graph and thus be mutually-independent from other edits.

Another possible approach would be to pass on the "lost" portion to another edit. For example, B could change to include some of A's edit, namely, "eggs". Thus, B''s edit would insert hate eggs and C(unchanged) would still change eggs to milk. However, this approach has it's own flaws: what would we do if F bubbled up? It also has some aspect of the final text (and) but not completely since the other portion, eggs, is changed by edit H. We can't possibly pass the lost portion to H since it changes eggs to omelettes. Additionally, passing that on to edit G will lead to a user's surprise ("I just wrote [content] and undoing it also removes something I didn't write!?").

Back to our previous example, edits B to E are not "bubbly" because their edits don't share anything with the final text. This is largely due to edit G since the edit adds content all at once instead of one at a time. If that edit was broken up into two edits (G1: h in have -> g as in gave; G2: and -> away and ate his), then E would be "bubbly" since it's change (te my -> ve your) would still appear in the final text. The below picture explains this better.

ufx - bubbly dag - edited

In this slightly altered example (where G is split into two edits), now edit D shows a different situation. If one tried to bubble it, what would happen? It does share milk with the final text, but only does so because it's really two edits in one: inserting my before and , too after C's milk. In such a case, perhaps "bubbling up" D should first split D into two edits (D1: inserts my before C's milk and D2: inserts , too after C's milk) and then "forward" the call to undoing C.

Thus, the graph from C to E would be:

After C applied: I hate milk
After D1 applied: I hate my milk
After D2 applied: I hate my milk, too
After E applied: I have your milk, too

So, the rules so far seem to be:

Finally, I wonder how many of these rules will be subjective to the context of the object being undone itself. For example, these are the rules necessary for undoing text, but if some other application uses them, will the same rules apply?

JordanMartinez commented 8 years ago

In this slightly altered example (where G is split into two edits), now edit D shows a different situation. If one tried to bubble it, what would happen? It does share milk with the final text, but only does so because it's really two edits in one: inserting my before and , too after C's milk. In such a case, perhaps "bubbling up" D should first split D into two edits (D1: inserts my before C's milk and D2: inserts , too after C's milk) and then "forward" the call to undoing C.

In situations where an edit inherits another edit (like D's necessity to inherit C's change), that edit should be split apart into two, and forward it's call to the original.

On second thought, this really doesn't make sense. If C was done by Area 1 and D done by Area 2 and D is bubbled up, why would the Area 2 undo the Area 1's undo? It shouldn't. However, if C is being bubbled up, then D could be split up into D1 and D2 so that C is now the free part.

JordanMartinez commented 8 years ago

Due to UndoManager's undoAvailableProperty and corresponding redoAvailableProperty, it seems the shared underlying NonLinearUndoManager (NLUM) would need a reference to a Proxy of the object being observed so that it could calculate whether an edit that is buried underneath other edits can be bubbled or not. In the case of StyledTextArea, its EditableStyledDocument would need to return a Proxy that gives the NLUM either the current text (e.g. for a plain text NLUM) or the rich text (e.g. for a rich text NLUM). This proxy would need to be "closed" when the final NonLinearUndoManagerProxy is closed.

To model a DAG, let's use a Map<Source, List<Change>> to track the changes a source has created and a Map<Change, List<Change> to keep track of the edges that point to (depend on) that change.

The process for undoing a change would be something like:

  1. Source calls undo
  2. It's NonLinearUndoManagerProxy forwards the request to the shared NLUM. If the source's undoAvailableProperty#get returns true, then that Source's list of changes has at least 1 mutually independent change or 1 change that can be bubbled up; thus, step 3 will follow. Otherwise, it returns false
  3. The NLUM uses the Source as a key in a Map to get the list of changes created by that source and which can be undone.
    • get the change that was created last and which can be undone / bubbled
    • if that change has dependencies, get the bubbled-up version of it (this action will also change the graph)
    • undo the change (which either already was mutually-independent or is now after the bubble-up process)
    • invalidate the NLUMProxies (all undo/redo available properties are re-computed)
    • return true

If an edit needs to be bubbled up, something like the following would need to occur:

  1. determine which part(s) (this would need to be specified via functional programming to account for different undo contexts) match up with the content returned from the Proxy of the observed object. In RichTextFX, it would be which parts of the plain / rich text of the document from change's start to change's end matches the change's plain / rich text.
    • If multiple parts of the change correspond to the content but are separated by a dependency (let's say A's edit was "I like milk" so that C never changes "milk" to "eggs." In this case, "I" and "milk" would both correspond to the final text but would be separated by B's change ("like -> "hate"), then A would be broken up into three changes (A1: "I", A2: "like", A3: "milk"). Via functional programming, a developer would control which of these two "bubbly" edits (A1 or A3) would be chosen.
    • If only one part of the change corresponds to the content and is not separated by a dependency (A1: "I", A2: "like eggs"), then A would be broken up into the free / mutually independent part (A1) and the part that has dependencies (A2). A1 in this case would be chosen.
  2. The graph would be updated
    1. Remove edit A from our list and it's mapping of its dependencies
    2. Add the unchosen edit(s) A2 (and A3 in multiple parts case) to the source's list at the position of the original unsplit edit (e.g. A) and the corresponding mapping of its dependencies (given key A2/A3, which other edits depend on it?)
    3. Add the chosen edit at the end of the source's list of changes and an empty list for its dependencies
    4. Return the chosen edit

One issue I can foresee is if one wants to implement a NLUM with a fixed size of changes. In the above example, one change was made into three. How should that be dealt with? If the limit and number of changes is at 100 and the bubbling process splits one change into 3, what happens to the other two?

To implement forgetHistory, one could clear out a Source's list of changes. To implement mark, one could clear out the changes from the first item in the list of changes to the current item.

JordanMartinez commented 8 years ago

I've thought more about the current position relativity issue.

Given a NonLinearChangeQueue (Queue 1) of 5 mutually-independent changes (1, 2, 3, 4, 5), the current position should point to 5. However, if another Queue (Queue 2) adds 6, then we need to recalculate whether 5 is still "valid." If it is found not to be valid, then the current position should go back in the queue one change and see if that change is now valid. This loop should continue until the current position points to a valid change or hits 0. In our example, the current position would point to 4 because it's still valid. However, if 6 is undone, then the current position of the Queue 1 should jump back to 5.

Similarly, there's another situation that can arise. Let's say Queue 2 adds 6 which invalidates Queue 1's 5 and Queue 3 adds 7 which invalidates Queue 1's 4. Let's also say that 6 and 7 are mutually independent from another. Thus, Queue 1's current position points to 3. There are a couple of ways this situation could further play itself out:

Thus, there are really 2 "current position" values that need to be kept track of: the "relative" current position (the one that changes when a change becomes invalid or valid again, the one used in hasNext and hasPrev) and the "true" current position (the one that points to the change(s) that should occur next but can't due to being invalidated. The "true" current position would be represented by a List since there can be multiple changes that should occur but which become valid under different circumstances as we saw from above.

JordanMartinez commented 8 years ago

Another issue I thought of this morning: the list of redoable changes for each ChangeQueue needs to be filtered each time a new change is done/redone to see which of those redoable changes are still valid.

JordanMartinez commented 8 years ago

Additionally, I realized another issue. When I try to close a ChangeQueue, what should be done about its changes in the DirectedAcyclicGraph's allChanges list? The issue lies in knowing which changes are "bubbly" and which aren't. If CQ1's change was modified by one or more of CQ2's changes, the one I'm trying to close, then CQ1's change should not be "bubbly". If I remove CQ2's change, then it's possible that it would appear to be "bubbly" and cause an error later on if undone. So, I see 2 ways to handle this issue:

Edit: 5/16/16: The second approach should be chosen for another reason: what happens if the underlying undoable content (or EditableStyledDocument in RichTextFX) is modified by something without an UndoManager attached to it? Unless there's an interface that that content implements, the DAG wouldn't have an accurate version of the content and thus wouldn't have an accurate list of valid undoable/redoable changes.

JordanMartinez commented 8 years ago

Another issue I thought of this morning: the list of redoable changes for each ChangeQueue needs to be filtered each time a new change is done/redone to see which of those redoable changes are still valid.

Due to this situation, it seems that even the redoable changes could also be "bubbled" (though it would be in the opposite direction when compared to the undoable changes).

JordanMartinez commented 8 years ago

Thinking about how to implement a fixed size non linear change queue...

"Fixed size" implies that there are only so many changes to keep in memory before casting out extra ones. However, what does one do when the queue has already reached its limit and a change is bubbled? Bubbling a change can work for both undos and redos.

Let's assume a queue has reached it's maximum size. If a redoable change is bubbled, then the extra change will propagate down to the queue's undo list. If an undoable change is bubbled, it is later added to the redo list. In either case, it isn't reasonable for the change that was just undone/redone to then be impossible to redo/undo because it was popped off. Rather, an item at the end of the list should be removed. For example:

|---- undo list ----|=|---- redo list ----|
xxxxxxxxxxxxxxxxxxxx = xxxxxxxxxxxxxxxxxxxx
^                    =                    ^

However, which list's last item should be popped off?

Thus, it seems wiser for the fixed size implementation to have two size limits: one for the undoable changes list and another for the redoable changes list. The issue here lies in deciding which list to use to determine when the size limit has been reached and thus when to pop off a change. Do we use the validChanges list, which is relative and whose length is unpredictable or the allChanges list, which is predictable, but not always relevant since it's not used to undo/redo changes?

Given a queue where

Then, we can consider a few situations.

The size limit could be calculated using both lists. For example, remove a change from the undo list if validChanges > x && allChanges > y.

Another possible approach is to not implement a fixed-size non-linear change queue in the first place.

JordanMartinez commented 8 years ago

@TomasMikula In light of my previous comment, I did want to clarify something. Is the fixed size queue comparable to a glass full of sand or a tunnel full of objects?

My previous comment assumes the second approach, but I wasn't sure if that's actually what you're fixed-size linear change queue does.

JordanMartinez commented 8 years ago

Also @TomasMikula, I'm not sure how / whether to implement the UndoPosition and ChangeQueue.Position interfaces for a non-linear approach (mostly because of the queue's relative nature). Any ideas?

JordanMartinez commented 8 years ago

In light of my previous comment, I did want to clarify something. Is the fixed size queue comparable to a glass full of sand or a tunnel full of objects?

I looked over your FixedSizeChangeQueue's test and it indicates it's the second approach.

I also looked over your current implementation of the UndoPosition and ChangeQueue.QueuePosition interfaces. Here's my understanding:

interface UndoManager {

    interface UndoPosition {
        /**
         * Sets the mark of the underlying UndoManager at this position,.
         * whether that position is valid or not.
         */
        void mark();

        /**
         * Checks if the UndoManager's current position 
         * points to the same change as this one
         */
        boolean isValid();
    }

    /*
      "Saves" the current position of the UndoManger
       which can be used to indicate whether the UndoManager
       points back to the same position as before or not.
     */
    UndoPosition getCurrentPosition();
}

and

interface QueuePosition<C> {
    /*
      Indicates whether the position refers to a change
        that is still within the queue. It is valid if the change 
        is either in the undo or redo list of changes, REGARDLESS
        OF WHETHER OR NOT THE CHANGE CAN BE UNDONE/REDONE
     */
    boolean isValid();
}
JordanMartinez commented 8 years ago

To make QueuePosition#isValid work in my PR, I had to use something similar to what a LinearChangeQueue does when it performs a change: store the changes's revision and later use that number in the change that is added to either a queue's undo/redo list. In this way, a queuePosition will be considered valid if it matches a change's revision whether that change was bubbled (and thus would be different than the original) or not.

JordanMartinez commented 8 years ago

Here's what I've concluded so far in trying to implement this:

Added the following on 6/4: