IDEMSInternational / R-Instat

A statistics software package powered by R
http://r-instat.org/
GNU General Public License v3.0
38 stars 103 forks

Adding an undo to R-Instat? #4185

Open rdstern opened 6 years ago

rdstern commented 6 years ago

With ordinary R you don't need an undo, because you are running code. So you can easily go back and then not run the last commands. When I discuss R-Instat I am often asked about undo. If we did want to add it, then how should it work? I assume it would all be in R.

In Genstat they say - just for sheets (data frames):

Undo: Reverses the last edit operation on a spreadsheet. An undo list is maintained for each open spreadsheet. You can set the maximum number of undo operations using the File tab on the spreadsheet options.

Redo: Reverses the last Undo operation. By default, this option is disabled, however, you can activate this by selecting the Enable Redo option on the General tab of the spreadsheet options. Note that activating the Redo facility can considerably increase the memory requirements if using large spreadsheets.

In Genstat I rather like that the menu does not simply say Undo, but changes to say, for example, Undo Edit Column.
a) If I change a cell in the sheet it says Undo cell. It changes one cell back at a time.
b) If I paste from Excel into the sheet, then it says Undo Paste.
c) If I change numeric to a factor it says Undo Conv Col.
d) If I use calculate to produce a new column it says Undo Insert Column.
e) If I use calculate and change Yield, but put the results back in Yield (i.e. same column) it says Undo Add Data.
f) If I sort on Yield it says Undo Sort.
g) If I do a summary to a new sheet it doesn't have undo there, or on the original sheet.
h) If I return to the original sheet, then it does have the undo and redo there.

We could then go through the different dialogues and assess:
1) Which ones need an undo. For example, many simply produce output or a graph. They don't need an undo. We are largely left with the Prepare menu.
2a) Some (like calculate) usually add a column in the same data frame. We could ignore that. Or undo would simply delete that new column.
2b) On the other hand we can (in calculate) give the same name as an existing column. Then it changes the column and undo would replace it by the old one again.
3a) Other dialogues write to a new data frame. We can undo by simply deleting that data frame. (Whether we make that deleting a part of undo or we assume we can do this ourselves can be part of the discussion.) Genstat only has undo/redo on each data frame in turn.
3b) If we use the same summary dialogue twice - in the same way - it may overwrite the columns it produced last time. Undo could then restore those columns - or we just use the version that is correct. (Like Genstat we could ignore all these initially?)
4) Dialogues like sort change a whole data frame. (Are there others?) With sort, the undo could copy the data frame or simply copy a variable with the original order.
5) Many dialogues change the metadata (like rename), as do most of those that change the type of the data. They can perhaps be restored by keeping the old names, etc.
6) And (of course) we can change values in the grid. Currently this is just single values, but we may get more ambitious. The old values could be "remembered" in a new hidden data frame perhaps. Perhaps this could be like the comments dataframe? Or maybe we just remember these values - like cell and paste in Genstat - either in R, or in the front end.
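To make the sort case in 4) concrete, here is a minimal sketch of the "copy a variable with the original order" option. It is written in Python purely for illustration (in R-Instat this would be R), and the dict-of-lists `Frame`, the function names and the example columns are all hypothetical, not anything in R-Instat.

```python
from typing import Dict, List, Tuple

# Hypothetical stand-in for a data frame: column name -> list of values.
Frame = Dict[str, List]


def sort_with_undo(frame: Frame, by: str) -> Tuple[Frame, List[int]]:
    """Sort all columns by `by`; also return the row order that was applied."""
    n = len(frame[by])
    # order[i] is the original row index that ends up at sorted position i.
    order = sorted(range(n), key=lambda i: frame[by][i])
    sorted_frame = {name: [col[i] for i in order] for name, col in frame.items()}
    return sorted_frame, order


def undo_sort(sorted_frame: Frame, order: List[int]) -> Frame:
    """Put every row back at its original position using the saved order."""
    restored = {name: [None] * len(col) for name, col in sorted_frame.items()}
    for name, col in sorted_frame.items():
        for pos, original_index in enumerate(order):
            restored[name][original_index] = col[pos]
    return restored


# Usage: sort on yield, then undo without having kept a full copy.
data = {"plot": [1, 2, 3, 4], "yield_t": [5.2, 3.1, 4.8, 3.9]}
sorted_data, order = sort_with_undo(data, "yield_t")
restored = undo_sort(sorted_data, order)
```

Only one integer per row is kept, which is much cheaper than a second copy of the whole data frame. The cost is that this kind of undo is specific to sort, whereas a full copy works for any operation.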

It doesn't appear so difficult to get to a simple undo - at least going back one dialogue or one edit.

And we would (at last) have some use for the Edit menu!

rdstern commented 3 years ago

I describe here a very simple possible Undo. It will be easy to add to it, but I would propose we remain very simple, i.e. just one undo (and possibly its redo partner) for a single operation on only a single dataframe. And possibly only for reasonably small data frames.

But I wonder if it would also be very easy to have a separate Return feature. This becomes enabled when we do a backup. If implemented it simply returns to that set of data frames. To avoid any complications you return simply to the last backup data book? (Addition in 2024 - we now have this!)

Now here is a small edit on the undo possibility from before: After @shadrackkibet's work on renaming multiple columns I am back to wondering about an undo in R-Instat - like Genstat offers.

The Rename is an example where we are changing the data frame and may make a mistake.

So, simply make a copy of the data frame before making any changes to it. Then we could swap it back, if need be, i.e. if we went to the Edit Menu, there would now be an entry that says Undo: Rename. Usually (when there is no backup data frame) the Undo is disabled.
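As a rough illustration of this "copy first, swap back if needed" idea, here is a minimal sketch in Python (R-Instat itself would do this in R). The `SingleUndo` class, its method names, and the dict-of-lists data book are assumptions made for the example only.

```python
import copy
from typing import Dict, List, Optional, Tuple

Frame = Dict[str, List]   # hypothetical data frame: column name -> values
Book = Dict[str, Frame]   # hypothetical data book: frame name -> frame


class SingleUndo:
    """Holds at most one backup: (data frame name, action label, copy of the frame)."""

    def __init__(self) -> None:
        self._backup: Optional[Tuple[str, str, Frame]] = None

    def remember(self, frame_name: str, action: str, frame: Frame) -> None:
        # Copy BEFORE the dialog changes anything; replaces any earlier backup.
        self._backup = (frame_name, action, copy.deepcopy(frame))

    def menu_label(self) -> Optional[str]:
        # None means the Edit > Undo entry would be shown as disabled.
        if self._backup is None:
            return None
        return f"Undo: {self._backup[1]}"

    def undo(self, book: Book) -> bool:
        if self._backup is None:
            return False
        frame_name, _, old_frame = self._backup
        book[frame_name] = old_frame   # swap the backup copy back in
        self._backup = None            # only one step back is supported
        return True


# Usage: back up, rename a column, then undo the rename.
book = {"survey": {"yeild": [2.1, 3.4], "village": ["A", "B"]}}
undo = SingleUndo()
undo.remember("survey", "Rename", book["survey"])
book["survey"]["yield"] = book["survey"].pop("yeild")   # the rename itself
label_after_rename = undo.menu_label()
undo.undo(book)
```

Because there is only one slot, each new backup overwrites the previous one, so memory use never exceeds one extra copy of a single data frame.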

I don't want R-Instat to become slower. So this is only for selected dialogs and probably also only for small data frames. We need to define "small" and for me it can be as small as you like!

There are only a few dialogues that do this. Those are the ones where we make the backup data frame. We also only have one instance of this.
a) Rename single or multiple columns
b) Changing the type of an existing variable.
c) Adding a new variable - possibly only where we choose to overtype an existing one!
d) Deleting variables
e) Deleting rows
f) A recent addition is our copy/paste facility that is becoming available
g) There is also edit cell and delete cell(s) recently.
h) And there is the climatic Edit dialog.

What else? Of course a big one is deleting data frames! I suggest that's different and would be covered by the proposed Return, or Restore, where - when we have saved the current situation - we can return. (If that existed we could perhaps add an extra warning in a situation where there is no backup set?)

(2024 update - this is now covered, in my view, by our working - I hope - big undo, that proceeds via our regular backing up process.)

rdstern commented 2 years ago

This relates to discussion #7130 and more generally to ease of use of R-Instat, where we score poorly in the review. Another category where undo is useful is when we are making changes in ReoGrid, perhaps when entering a set of data. It could then say "undo grid entry", or "undo grid change". Perhaps "undo data frame entry" or "undo data frame change" would be clearer.

The only downside mentioned has been the time and memory implications of having another copy (perhaps multiple copies) of a data frame. This could be left for now, but there could be a check, if needed, that undo is only offered for data frames that are not too large. The command `object.size(ghana)` gave 6628000 bytes for quite a large data frame, so that's interesting, and the check is easy.
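Such a size check might look something like the sketch below. It is in Python for illustration only; in R the size would come straight from `object.size()`. The `approx_bytes` helper, the `MAX_UNDO_BYTES` limit and its value are hypothetical, and the byte count is only a rough estimate.

```python
import sys
from typing import Dict, List

Frame = Dict[str, List]   # hypothetical data frame: column name -> values

# Hypothetical limit; what counts as "small" is still to be decided.
MAX_UNDO_BYTES = 1_000_000


def approx_bytes(frame: Frame) -> int:
    """Rough size estimate: the container, each column list, and each value."""
    total = sys.getsizeof(frame)
    for name, col in frame.items():
        total += sys.getsizeof(name) + sys.getsizeof(col)
        total += sum(sys.getsizeof(value) for value in col)
    return total


def can_snapshot(frame: Frame, limit: int = MAX_UNDO_BYTES) -> bool:
    """Offer undo only when a backup copy would stay under the limit."""
    return approx_bytes(frame) <= limit


# Usage: a tiny frame qualifies for undo, a 200,000-row one does not.
small = {"yield_t": [5.2, 3.1, 4.8]}
large = {"x": [float(i) for i in range(200_000)]}
small_ok = can_snapshot(small)
large_ok = can_snapshot(large)
```

When the check fails, the dialog would simply run without taking a backup and the Undo entry would stay disabled, so large data frames are never slowed down.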

Patowhiz commented 2 years ago

I think this somehow relates to what @ChrisMarsh82 is doing with regard to saving the R-Instat session. After his work this would probably be easily achievable.

shadrackkibet commented 2 years ago

My idea is as follows:

This could be done at databook level. We need some sort of stack data structure, with an option for the number of retrievals we can have - this is like saving copies of data frames - or we can decide to save in the stack only the specific elements that change, e.g. just the column names. However, this needs careful thought because it will have implications for the size of the databook, which can slow things down. So having an option of, say, only one undo might be sufficient for now.
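A minimal sketch of such a bounded stack, storing only what changed (here just the old and new column names) rather than whole data frames, could look like this. It is in Python for illustration (R-Instat would do this in R), and the `RenameHistory` class and the `max_undos` option are hypothetical names.

```python
from collections import deque
from typing import Deque, Dict, List, Tuple

Frame = Dict[str, List]   # hypothetical data frame: column name -> values


def _rename_in_place(frame: Frame, old: str, new: str) -> None:
    # Rebuild the mapping so the renamed column keeps its position.
    items = [(new if key == old else key, col) for key, col in frame.items()]
    frame.clear()
    frame.update(items)


class RenameHistory:
    """Bounded undo stack that records only (old name, new name) pairs."""

    def __init__(self, max_undos: int = 1) -> None:
        # With maxlen set, the oldest entry is dropped once the stack is full.
        self._stack: Deque[Tuple[str, str]] = deque(maxlen=max_undos)

    def rename(self, frame: Frame, old: str, new: str) -> None:
        if old not in frame or new in frame:
            raise KeyError(f"cannot rename {old!r} to {new!r}")
        _rename_in_place(frame, old, new)
        self._stack.append((old, new))

    def undo(self, frame: Frame) -> bool:
        if not self._stack:
            return False
        old, new = self._stack.pop()
        _rename_in_place(frame, new, old)   # apply the inverse rename
        return True


# Usage: three renames with room for two undos; the oldest cannot be undone.
frame = {"a": [1], "b": [2]}
history = RenameHistory(max_undos=2)
history.rename(frame, "a", "x")
history.rename(frame, "b", "y")
history.rename(frame, "x", "z")
first = history.undo(frame)    # z -> x
second = history.undo(frame)   # y -> b
third = history.undo(frame)    # nothing left to undo
```

Setting `max_undos=1` gives the "only one undo" option. Storing name pairs costs almost nothing, but it only covers operations that have a cheap inverse like this; deletions would still need the old values kept somewhere.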

rdstern commented 2 years ago

@Patowhiz I was after something very simple, that would only work, perhaps, for one data frame and just one command back. It just adapts what you did at the end of a command once. It also doesn't add to the databook, but is stored temporarily somewhere - as we currently do with graphs. All it would do, on any action (grid or dialogue) where needed, is take a copy of the data frame (assuming it is not too large). It stores it together with the name of the data frame and the command/dialogue. So, in the Edit menu it would then appear enabled as undo_grid or undo_type (the latter if, say, the change was to numeric). There might be a tooltip which would say "Undo a change to xxx dataframe".

There are also not many situations where we need this. Most commands add columns, or add data frames, etc. I suggest that (at least initially) for those dialogues that normally add a new variable, but where you could overwrite an existing one, you might still be "on your own".

But delete rows, delete variables, change type, etc., as well as editing data in the data frame, would be included.