reprogrammer closed this issue 13 years ago
The above problem is caused by a conflict editor save operation that is not preceded by a corresponding conflict editor open operation. It is not clear how this scenario can happen, but checking whether the saved conflict editor exists and skipping the save if it does not (see the above commit) made it possible to continue replaying the sequence of user operations recorded for participant cs-504.
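For readers without access to the commit, the guard described above can be sketched roughly as follows. This is only an illustration under assumptions: the class `ConflictEditorReplaySketch`, its methods, and the string editor IDs are hypothetical and are not taken from the CodingTracker code base.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for the replayer's bookkeeping of conflict editors.
// None of these names come from CodingTracker itself.
final class ConflictEditorReplaySketch {
    // Conflict editors that a replayed "open" operation has registered, with their text.
    private final Map<String, StringBuilder> openEditors = new HashMap<>();
    // Text captured by each successfully replayed "save" operation.
    private final Map<String, String> savedContents = new HashMap<>();
    private int skippedSaves = 0;

    void replayOpen(String editorId, String initialText) {
        openEditors.put(editorId, new StringBuilder(initialText));
    }

    // Replays a save. If no matching open was recorded, the save is skipped
    // (and counted) instead of failing, so the replay can continue.
    boolean replaySave(String editorId) {
        StringBuilder document = openEditors.get(editorId);
        if (document == null) {
            skippedSaves++;
            return false;
        }
        savedContents.put(editorId, document.toString());
        return true;
    }

    int skippedSaves() {
        return skippedSaves;
    }

    String savedContent(String editorId) {
        return savedContents.get(editorId);
    }
}
```

Counting the skipped saves, rather than ignoring them silently, keeps the anomaly visible in case it shows up for other participants.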
The only other data anomaly was caused by two duplicated code edits: the same code change event was recorded twice. Again, it is not clear how this can happen, since CodingTracker registers itself on newly created text buffers, so this scenario would mean that either two buffers share the same document or the same buffer sends its creation notification twice. In both cases this would be an Eclipse bug. There may be other scenarios, but I have no idea how to reproduce this problem. So far this is the only such case among all recorded operations of all participants that I have replayed. For this particular case, removing the two duplicated user operations (with timestamps 1307782748997 and 1307782749001) was enough to replay the sequence to the end. If we observe this more often, then instead of fixing it manually we will implement a postprocessing step that eliminates duplicated user operations.
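If such a postprocessing step is ever needed, it might look roughly like the sketch below. Everything in it is an assumption made for illustration: the `TextEdit` record is a simplified, hypothetical shape of a recorded code edit (not CodingTracker's actual operation format), and "duplicate" is taken to mean an edit that repeats the previously kept edit's offset and text within a few milliseconds.

```java
import java.util.ArrayList;
import java.util.List;

final class DuplicateEditFilterSketch {
    // Hypothetical, simplified shape of a recorded code edit.
    record TextEdit(long timestamp, int offset, String removedText, String addedText) {
        boolean sameChangeAs(TextEdit other) {
            return offset == other.offset
                && removedText.equals(other.removedText)
                && addedText.equals(other.addedText);
        }
    }

    // Keeps edits in order, dropping any edit that repeats the previously
    // kept edit and was recorded at most windowMillis after it.
    static List<TextEdit> dropDuplicates(List<TextEdit> edits, long windowMillis) {
        List<TextEdit> kept = new ArrayList<>();
        for (TextEdit edit : edits) {
            TextEdit previous = kept.isEmpty() ? null : kept.get(kept.size() - 1);
            boolean duplicate = previous != null
                && edit.sameChangeAs(previous)
                && edit.timestamp() - previous.timestamp() <= windowMillis;
            if (!duplicate) {
                kept.add(edit);
            }
        }
        return kept;
    }
}
```

The window has to stay small: a legitimate sequence such as holding down the Delete key also produces identical edits at the same offset, only spaced further apart, so a generous window would remove real user operations.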
@Wanderer777: Have you tested the scenario where the user opens a file in two editors? You can open a new editor from an existing editor by right-clicking on the title of the editor and selecting "New Editor".
@reprogrammer: Yes, I tested that scenario. It was one of the first scenarios I used when I started working on CodingTracker. There are different ways to open several editors on the same file, and with the exception of conflict (compare) editors, all text editors share the same document (that is why a change made in one editor shows up in all other editors of the same file). A single buffer is created for the file regardless of the number of editors connected to it. Conflict editors are different because each has its own document, and CodingTracker uses a different mechanism to track changes to the documents of conflict editors.
I found the same data anomaly in the recorded operation sequence for participant cs-506. The operations with timestamps 1308238981896, 1308238981899, 1308446718671, and 1308446718679 are duplicates that have to be removed for the operation sequence to replay correctly. I still cannot reproduce this problem, but it looks more and more like a possible Eclipse bug. In particular, the duplicates with timestamps 1308446718671 and 1308446718679 belong to the same logical sub-sequence of code changes (adding a class that implements HttpSession using Eclipse code generation), but only the first two operations were duplicated, while the many following operations of this sub-sequence were recorded correctly. Note that this whole sub-sequence is the result of a single "add inner class" high-level action, which Eclipse performs automatically. This suggests that there is a single document listener that receives each of the first two event notifications twice (which is a bug), while the following event notifications are received correctly (i.e. just once).
So far, there are very few instances of this problem in the recorded sequences, and they were relatively easy to detect and fix. If this problem becomes more pervasive, we will need to implement an automatic fixer as a postprocessing step.
@Wanderer777: We're going to analyze the current data, and our analysis depends on the CodingTracker log files, too. If you think some problems in the CodingTracker log files need to be fixed manually, or you don't have automated fixes for them, we'll ask you to store the fixed log files in our internal SVN repository.
@reprogrammer: Sure, I can do that. But the problem is that our participants will keep uploading new data. And to analyze the most recent data, we will have to fix the same operations again. Then again, again, and again for the following uploads. So, either you will be restricted to the version of data that I fixed and stored in our SVN repository, or I will have to repeat the manual fixes over and over again, which does not sound right to me.
@Wanderer777: As you suggested before, I think it's a good idea to automate the fix if you think the overhead of manual fixing is high.
@reprogrammer: In this particular case the overhead of manual fixing is very low if I do it only once - for the final upload of the recorded operations (i.e. at the end of the study for these two participants, or when they switch to a new CodingSpectator version and thus, start new sequences).
@Wanderer777: We'd like to analyze the data continuously because finding a problem at the end of the study will be too late.
@reprogrammer: The duplication problem is solved by the above commit, so postprocessing is not required any more. But let's keep this issue open for a while, since it contains information on how to manually fix the two operation sequences that contain duplicates.
@Wanderer777: Postprocessing is still required for the existing data, isn't it?
@reprogrammer: There are just a couple of instances of this problem in the already collected data that I am aware of. And I think this is too little to justify an automated fix.
This issue is fixed. The required manual fixes in the operation sequences that were recorded before this fix are reported in issue #263.
The replayer of CodingTracker fails with the following exception on "cs-504/b7709c12-ba60-4d6b-ad92-f370765d6fce/1.0.0.201105300951/codingtracker/codechanges.txt".