Closed fpw closed 7 months ago
@fpw Thank you for reporting this. I'm able to reproduce the issue.
First of all, I'm happy that this issue is reproducible, so I can clearly track it down, and I'm curious what I will find. If we are lucky, we'll find the root cause of #713, and if not (which I expect), there is just something wrong with undoability of Addressables like RAM. But there seem to be multiple issues here at work, and they are clearly tied to the Undo system. We'll see..
I think we can rule out asynchronous behaviour as root of these kind of problems, because AFAIK everything in the editor runs off the AWT thread, strictly in a synchronous manner. But the core design of the Undo system can lead to the way you've perceived the behaviour. Let me explain.
The Undo system is based on Command objects. Every change on the circuit state must be treated by a Command. A Command has an execute() method. UndoableCommand has also an undo() method. Because it is straightforward to implement execute(), but often hard or even impossible to implement undo() for certain operations (especially those that deal with wire connections), the Undo system features a "Snapshot" concept. Snapshots are in-heap copies of the entire circuit. When the user performs an undo operation, the Undo system checks whether the last Command in the stack is undoable. If so, it is undone. If not, the Undo system takes the last snapshot and re-plays (executes) all its Commands, except the last one which is to be undone. In order to limit the time to replay, the number of Commands per Snapshot can be configured in the user settings. If that number is exceeded, a new Snapshot is created.
While this approach may seem elegant and convenient for the developer, it is also error prone, even fragile, and potentially dangerous in terms of corrupt user data. The following aspects can break the system and lead to the symptom in #713.
So, what you perceive as asynchronous or "random" behaviour, is probably produced by the moment when snapshots are created, or the system replays from a snapshot. These moments depend on what operations you use, and the configured number of Commands per snapshot, and when you save the circuit (which drops the Snapshots).
I'll proceed as follows:
Thanks a lot for the in-depth explanation! With that in mind, I will do a few more experiments in the hope to find a reproducible way to cause the wiring error.
BTW: If you are working on the development system, you could enable GraphViewConsistencyCheck in GraphModuleJVM.initialize() by removing the if() statement. This will check for the symptom in #713 with every executed Command, highlight the location of the issue in the circuit (if I remember correctly), and popup a Dialog.
The problem was indeed in the RAM/ROM components and their various UI representations. They didn't work correctly together with the undo/redo system.
The related, but separate issue of the poisoned undo stack even when opening other circuits will be treated in a separate investigation.
This might be related to #713 - I have a feeling that when this error is shown, Antares reverts the state back to some previous state. And that state seems to be asynchronously recorded because the number of gates that are still there depend on my placing speed or some other parameter I didn't understand - sometimes, both the RAM and the AND gate disappear and sometimes one of them is still there. If the state that Antares reverts back to is really recorded asynchronously, maybe it's possible that there's a race condition where the state is recorded between model and view changes, e.g. if a wire is placed and the view is updated before the model, is it possible that the state was saved during this so that the recorded state doesn't contain the model connection, even though the view connection is already there? In that case, reverting to this state would cause #713 .
Another possibility I thought of - just thinking aloud in the hope to trigger an inspiration for a fix - Since the undo stack is now in a situation where some action from a previous circuit is also present in a new circuit, is it possible that the undo stack is not properly cleaned between circuits? Maybe it's possible that something like this happens: Delete connection between vertex 2 and 3 in one circuit, hit undo in a new circuit and suddenly the connection between vertex 2 and 3 in the new circuit disappears. This doesn't happen under normal circumstances because the Undo stack seems to be cleared, but the fact that the error described in this circuit isn't cleaned up when switching circuits, maybe there's a possibility for something like this to happen.
Error dump: