IDEMSInternational / R-Instat

A statistics software package powered by R
http://r-instat.org/
GNU General Public License v3.0

Giving R commands in R-Instat - options by context! #4445

Open rdstern opened 6 years ago

rdstern commented 6 years ago
  1. We have always felt that users should be encouraged to migrate from R-Instat to RStudio when they need to give commands to R, rather than just using the dialogues. This route continues and we should also continue to check that it is as simple and encouraging as possible.

  2. An alternative is to give R commands within R-Instat. David and Danny have always been reticent about this option, and for good reasons. They include: a) We don't have good error trapping - we just throw the commands at R. So if they work, that's great, but if not, then the messages are usually difficult to use to make the necessary corrections. b) There is nothing to stop us giving commands that mess the user up completely. In particular, commands like q() would throw us out of R-Instat unceremoniously, and other commands may interact with our work and could leave it in a mess. (We should investigate that, to advise on how to minimise the risk.) c) The sets of commands look messier than pure R (and pure R can be messy enough!). In particular, some sets of commands could be totally separate from our Instat object (they will be pure(ish) R, but why use R-Instat then? I explain why below!). Or they make use of our Instat object, and then they do look rather complicated.
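Points a) and b) can be illustrated with a minimal sketch in plain R. It is not how R-Instat currently sends commands to R; `run_guarded` and its `blocked` list are hypothetical names used only for illustration. The idea is to parse the text first, refuse a few calls such as `q()`, and turn any R error into a short message.

```r
# Hypothetical helper, not part of R-Instat: run user-typed R code with
# basic error trapping and a small block-list of dangerous calls.
run_guarded <- function(code_text,
                        blocked = c("q", "quit"),
                        envir = new.env(parent = globalenv())) {
  # Parse first, so syntax errors are reported before anything is run.
  exprs <- tryCatch(parse(text = code_text), error = function(e) e)
  if (inherits(exprs, "error")) {
    return(list(ok = FALSE,
                message = paste("Syntax error:", conditionMessage(exprs))))
  }

  # Refuse code that mentions a blocked name. This errs on the side of
  # caution: a variable called q would be refused as well.
  hit <- intersect(all.names(exprs), blocked)
  if (length(hit) > 0) {
    return(list(ok = FALSE,
                message = paste("Blocked call:", paste(hit, collapse = ", "))))
  }

  # Evaluate each expression in turn; any R error becomes a message.
  tryCatch({
    value <- NULL
    for (ex in exprs) value <- eval(ex, envir = envir)
    list(ok = TRUE, value = value)
  }, error = function(e) list(ok = FALSE, message = conditionMessage(e)))
}
```

With this sketch, `run_guarded("q()")` returns a list with `ok = FALSE` instead of closing the session. A name-based block-list is easy to get around, so it only reduces the risk in b); it does not remove it.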

Despite these issues I claim that this feature in R-Instat is so important that we keep working on it, to make this route as acceptable as possible. Here is why, through 2 examples:

  1. Climatic audiences: They will love R-Instat. But there are so many more analyses in R packages (some well and some poorly written) that will take ages to even consider adding to R-Instat. In addition, there is a risk that adding many of them will clutter R-Instat and make it more complicated to use for the routine climatic work. When these users are ready, they have access to everything from RStudio, but there will be plenty who just want one or two specific analyses. And these may well be different for each Met service. While they wait, we could provide a set of commands (at least temporarily) that are then executed from the script window. If the set is modest, then we could do it from R-Instat, and once it gets big, then we either recommend the RStudio route more strongly, or (in the other direction) check whether we could add a new dialogue.
  2. Agricultural Research audiences: I still think we will be providing quite a lot that is useful within the R-Instat menu structure. And this should be particularly so following the forthcoming GCRF work. However (for example) Ric stated that his reticence was because of the following: "a list (of dialogues in R-Instat) is always a highly filtered, selective and personal set of choices by R-Instat designers, and that will always be a bottleneck in making access to methods. My list of priority topics is not the same as yours, and I am sure that Carlos, Sam or others who spend time with these scientists would have lists very different from mine. One of the great things about R is the way that it grows almost daily, and my priority always will be to help people get started with tools that are right for their problems, which are also changing almost daily!" I am not sure I even agree with this. But Ric's point is that an analysis could then easily be suggested in RStudio. (Presumably often by providing the code for users to run.) I think that is likely to be easier than providing it in R-Instat. But: a) Having the R-Instat option may be more comfortable for some users - options by context no less! b) Sometimes part of an analysis could be done interactively in R-Instat. So what is provided may actually be simpler, and the suggested analysis becomes less of a black box. Ric also says: "I can certainly see the virtues of R-Instat for basic stats teaching in universities. As you know, I stopped doing that many years ago and have not tried to keep up with developments in thinking." This is a different point, but quite interesting too. I found that clients in agricultural research also needed basic stats teaching. It isn't just universities. And I claim that R-Instat is not a bad tool here.

I digress, but my main point is that where we are getting to on giving commands within R-Instat is potentially very valuable. We have given the process a bit of priority recently, with Maxwell and Danny's work on the script window and Shadrack's work on the Hypothesis testing dialogue. I would like to continue with this, both through added facilities in the script window, e.g. issues #4444 and #4442, and by checking what is possible (or not) through the hypothesis testing dialogue. And also possibly by enhancing the File > New dialogue (if that is easy), see #4439, etc.

And also getting more of the AMI and other teams to try using these facilities.

rdstern commented 6 years ago

I now have a specific example for the features above for my climatic work. We have the dialogue to produce extremes (Climatic > Prepare > Extremes). These have then to be analysed and the package called extRemes is the obvious one to use - and already part of R-Instat. But there are no dialogues yet, so how should we use it? I tried first the code here:

    z <- extRemes::revd(100, loc=20, scale=0.5, shape=-0.2)
    fit <- extRemes::fevd(z)
    fit
    plot(fit)
    plot(fit, "trace")
    extRemes::return.level(fit, do.ci=TRUE)

This is from 2 of their examples. It is just as they had it, but I had to add the extRemes:: where needed.

This was fine, but not "my data".

  1. So I took the Dodoma data from the library, then used Prepare > Column: Reshape > Summarise to get the max of rainfall on an annual basis. Then I typed extRemes::fevd(max_Rain) into the Model > Hypothesis dialogue. In the dialogue, I saved the results into fit, to tie in with the code above. Then I wanted to use the saved object, i.e. do the plots, etc., in the script window as below:

    plot(fit)
    plot(fit, "trace")
    extRemes::return.level(fit, do.ci=TRUE)

Of course this doesn't work, because it doesn't recognise the object fit (which is in the Instat object). a) How do I continue, i.e. start with the fitting as above and then get the script to recognise the resulting object? b) What I therefore did instead was to copy the code generated by the hypothesis test dialogue, i.e.

    Dodoma_by_Year <- InstatDataObject$get_data_frame(data_name="Dodoma_by_Year")
    attach(what=Dodoma_by_Year)
    fit <- extRemes::fevd(max_Rain)

This is just up to the fit line in the code above. This was run in the script window. Then the remaining lines, shown above, run just fine. Is this what I should be doing? I now see from here how I can get the script file to recognise the data from the R-Instat object. Would that also work for it to recognise the objects?
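One way to avoid `attach()` in the script window is to refer to the column through the data frame. The sketch below uses a simulated stand-in for `Dodoma_by_Year` (the values are made up) so that it runs outside R-Instat; inside R-Instat the data frame would come from the `InstatDataObject$get_data_frame()` call shown above. It does not answer the question of whether objects saved in the Instat object can be retrieved in the same way. The extRemes lines run only if that package is installed.

```r
# Simulated stand-in for the Dodoma_by_Year data frame (values are made up).
set.seed(1)
Dodoma_by_Year <- data.frame(
  year = 1961:2010,
  max_Rain = replicate(50, max(rexp(60, rate = 0.1)))
)

# Refer to the column explicitly instead of using attach(), so nothing is
# left on the search path after the script has run.
annual_max <- Dodoma_by_Year$max_Rain

if (requireNamespace("extRemes", quietly = TRUE)) {
  fit <- extRemes::fevd(annual_max)   # same fitting call as in the code above
  print(extRemes::return.level(fit, do.ci = TRUE))
}
```

The remaining lines, such as `plot(fit)` and `plot(fit, "trace")`, can then follow as before, since `fit` now exists in the script's own workspace.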

rdstern commented 6 years ago

I think that Maxwell is continuing to enhance the right-click menu for the script window. This is to confirm two additions: a) Could we have an option to run the current line (and then move to the next line)? This could also have a keyboard shortcut, which would then be consistent with RStudio. (And running a single line would not give the warning.) b) Could we also have a shortcut to run a selection? In addition, c) Maxwell tells me that adding Undo and Redo is also easy here, as the script is just a text window. Great if that could also be added.

dannyparsons commented 6 years ago

Also, can we check that the Script window is now using the built-in cut/copy/paste functions of the textbox and not our own custom functions? They should also be the first three items on the right-click menu.

maxwellfundi commented 6 years ago

@rdstern We have the feature of running selected text. Do you mean we should now be able to run a line at a time by clicking on Run, with the cursor then moving to the next line as in RStudio?

@dannyparsons yes, the copy, cut and paste use the inbuilt functions not our own functions. I will also move them to the top of the list. I purpose to have this for the next version.

rdstern commented 6 years ago

In answer to: "We have the feature of running selected text. Do you mean we could now be able to run a line at a time by clicking on run and the cursor moves to the next line like RStudio?"

Absolutely. And (also like RStudio) there should be a shortcut key for that. And it should not warn when running a line at a time.

rdstern commented 6 years ago

Lily's MSc project is (sort of) on adding a survival menu to R-Instat. She is responding to information from Jim Todd who would like to use R-Instat in his teaching in Tanzania and possibly elsewhere. Jim writes:

"My excitement with R-Instat is as an introduction to analysis using R. I think the menus allow students to try different analyses and to see the R-commands needed for those analyses. This will enable them to build up their range of commands and expertise in using them. This is quite straight forward for continuous and binary outcomes, as these are easy to see in many different scenarios. However I think it is a little more complex for survival analysis, especially with observational cohorts, which are common in epidemiological studies of human health."

From mid-October we are also teaching AIMS Cameroon students. They will go on to using R - with RStudio, but will use R-Instat in our first and also perhaps in a second course in statistics, which is taught by Jane Hutton from the University of Warwick.

So I would like to consider how we could use R-Instat in this way, i.e. to make it easier for users to then transition to R. I don't want to distract from being able to use R-Instat for the main objective, i.e. teaching statistics. If we were simply teaching R, then we should probably start directly with RStudio and ignore R-Instat. But if (for good reasons) we are using R-Instat itself, and then assume that later the students will migrate to RStudio, how can we make that combined process an easy one?

Here are some initial suggestions:

  1) When this is an objective, then students should install RStudio from the start, i.e. when R-Instat is installed.
  2) We document and explain early on about Instat objects. This links also to what can be saved and used in R-Instat, or exported and then opened in a separate session in RStudio.
  3) We discuss objects. Most are linked to a data frame and hence are saved as part of the Instat object. These include filters, graphs and models. (What others are there? I think calculations are not yet objects? Will tables be objects?)
  4) Can we colour code the R commands, so that the special parts that are R-Instat (at least reading from the Instat object and writing back to it) look different from the middle part?
  5) We can use the calculator to show how single commands are given.
  6) Similarly with the Model > Hypothesis Tests dialogue. It would be really good to add loops here (which are sort of in the dialogue, but not yet implemented), and also to add some keys (e.g. at the same time as we add the extRemes package) that use objects for further outputs.
  7) It is great (and important) that we have the To Script button on each dialogue. We should add some examples that make use of this feature a) to look at a command, b) to make a small change, c) to use 2 or 3 together, d) to pass through to RStudio.
  8) The issue #4386 is on giving a by command. Danny has some neat stuff here. Can we include a bit of this?
  9) I would still like to see the Wakefield stuff added that enhances File > New, Prepare > Column: Calculate > Calculator and Prepare > Column: Generate > Enter in ways that link well with giving commands, see #4439 and #4347, which are already down for Version 5.1.
  10) The issue #4438 might give some further ideas.
  11) What else?
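For point 8, a by-style calculation in plain R could look like the sketch below. It is only an illustration with made-up data and is not the approach discussed in #4386; the data frame `rain` and its columns are hypothetical. It gives one summary per group, much as the annual maxima were produced earlier through Prepare > Column: Reshape > Summarise.

```r
# Made-up daily rainfall for three years (hypothetical data).
set.seed(42)
rain <- data.frame(
  year = rep(2001:2003, each = 365),
  rainfall = round(rexp(3 * 365, rate = 0.2), 1)
)

# One summary per group: the annual maximum, returned as a data frame.
max_by_year <- aggregate(rainfall ~ year, data = rain, FUN = max)
names(max_by_year)[2] <- "max_Rain"

# The same "by" idea with tapply(), returning a vector named by year.
max_vec <- tapply(rain$rainfall, rain$year, max)
```

Both forms give the same three values; `aggregate()` is convenient when the result should go back into a data frame, and `tapply()` when a quick named vector is enough.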