africanmathsinitiative / R-Instat-Help

The latest uncompiled (.hnd) and compiled (.cmd) versions of the help file for the statistics software package R-Instat
https://chuffed.org/project/africandatainitiative
GNU General Public License v3.0

Guide and materials on teaching R through R-Instat #19

Open rdstern opened 6 years ago

rdstern commented 6 years ago

With the forthcoming course for AIMS students in Cameroon, one consideration is that these students will later proceed to using R with RStudio. We wonder whether R-Instat can also be used to introduce R itself.

There are different aspects. Here I introduce one small possible item. The Climatic > Prepare > Transform dialogue makes it easy to get running totals or means.

Here is the code in the script window:

# Code generated by the dialog, Transform
grouping <- instat_calculation$new(type="by", calculated_from=list("Guinee2"="year","Guinee2"="Station"))
transform_calculation <- instat_calculation$new(type="calculation", function_exp="zoo::rollapply(Rain, width=3, FUN=mean, fill=NA, align='right')", result_name="moving_mean", manipulations=list(grouping), save=2)
InstatDataObject$run_instat_calculation(calc=transform_calculation, display=FALSE)
rm(list=c("grouping", "transform_calculation"))

Currently the dialogue handles the mean, but not the median. Editing the code to use the median gives the following:

# Code generated by the dialog, Transform
grouping <- instat_calculation$new(type="by", calculated_from=list("Guinee2"="year","Guinee2"="Station"))
transform_calculation <- instat_calculation$new(type="calculation", function_exp="zoo::rollapply(Rain, width=3, FUN=median, fill=NA, align='right')", result_name="moving_med", manipulations=list(grouping), save=2)
InstatDataObject$run_instat_calculation(calc=transform_calculation, display=FALSE)
rm(list=c("grouping", "transform_calculation"))
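For students who later move to RStudio, the rolling calculation inside function_exp can also be tried on an ordinary vector, outside R-Instat. The following is only a minimal base-R sketch: the rainfall values are made up, and roll_right is a hypothetical helper written here to mimic a right-aligned window of width 3 padded with NA. It is not a function from zoo or R-Instat.

```r
# Hypothetical helper: apply FUN to each right-aligned window of `width` values.
# Positions without a full window get NA (similar in spirit to fill=NA, align='right').
roll_right <- function(x, width, FUN) {
  n <- length(x)
  out <- rep(NA_real_, n)
  if (n >= width) {
    for (i in width:n) {
      out[i] <- FUN(x[(i - width + 1):i])
    }
  }
  out
}

# Made-up rainfall values, for illustration only
rain <- c(0, 5, 10, 0, 20, 2)

# Swapping mean for median is the same one-word edit as in the script window
moving_mean <- roll_right(rain, width = 3, FUN = mean)
moving_med  <- roll_right(rain, width = 3, FUN = median)

print(data.frame(rain, moving_mean, moving_med))
```

Unlike the R-Instat calculation, this prints its result directly, which may make it easier to see what the window is doing.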

More ambitiously, we define a function and use that:

# Code generated by the dialog, Transform
grouping <- instat_calculation$new(type="by", calculated_from=list("Guinee2"="year","Guinee2"="Station"))
diff <- function(x) {mean(x)-median(x)}
transform_calculation <- instat_calculation$new(type="calculation", function_exp="zoo::rollapply(Rain, width=3, FUN=diff, fill=NA, align='right')", result_name="moving_diff", manipulations=list(grouping), save=2)
InstatDataObject$run_instat_calculation(calc=transform_calculation, display=FALSE)
rm(list=c("grouping", "transform_calculation"))
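As a point of comparison for RStudio, here is a rough base-R sketch of the same idea: a user-defined function applied in a rolling window within each Station and year. Everything here is an assumption for illustration: the data frame is made up, and roll_right and mean_minus_median are hypothetical names. The function is deliberately not called diff, because base R already has a diff() function (lagged differences) and a user-defined diff would mask it in the workspace.

```r
# Hypothetical function, named to avoid masking base R's diff()
mean_minus_median <- function(x) mean(x) - median(x)

# Hypothetical helper: right-aligned rolling window, NA where the window is incomplete
roll_right <- function(x, width, FUN) {
  vapply(seq_along(x), function(i) {
    if (i < width) NA_real_ else FUN(x[(i - width + 1):i])
  }, numeric(1))
}

# Made-up data using the same grouping columns as the dialog code
rain_df <- data.frame(
  Station = rep(c("A", "B"), each = 4),
  year    = rep(2000, 8),
  Rain    = c(0, 3, 9, 0, 1, 1, 10, 4)
)

# ave() applies the function within each Station/year group and
# returns the results in the original row order
rain_df$moving_diff <- ave(
  rain_df$Rain, rain_df$Station, rain_df$year,
  FUN = function(v) roll_right(v, 3, mean_minus_median)
)

print(rain_df)
```

The grouping matters: the window restarts at the first row of station B instead of running on from station A.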

I wonder about a more ambitious change and also how this might be done using RStudio. I am assuming this simple sort of edit might be useful in the script window and we would do other more ambitious analyses using RStudio.

One aspect is that it is easy (I think) to move from R-Instat to RStudio, but it is not obvious to me that this is useful when the edit is small and we then want to continue with R-Instat. So Prepare-type work might often stay in R-Instat, while Describe and Model work would be helped by migration.

rdstern commented 6 years ago

This follow-up is different. I found this blog very useful and it complements what we will try to do through R-Instat. In parallel with using R-Instat it would be good if students learned to use R directly. Here are 5 ways that are suggested. I propose to investigate these further, with the idea that students could (to some extent) choose for themselves.

These 5 ways are:
a) YouTube videos
b) Blogs
c) Online courses
d) Books
e) Experiment

Within each approach is a small list of suggestions.

Different people might use a different mix of these approaches - options by context. And, of course, they are not mutually exclusive.

dannyparsons commented 6 years ago

Making small edits in the script window is something I think the AIMS students can cope with if needed. I wouldn't suggest doing anything more with that example, like moving to RStudio, because it's not general R code, so it won't help them learn R and may confuse them about R code. There's no output from running a calculation either, so you then need to know how the whole instat object works to even see the result in the data.

If it was a graph, that's different, because the code will be almost standard ggplot2 code, and there I can see that moving to RStudio is useful, as there's lots you can do with the script and it's all useful to know in general.

In workshops I have imported data, done a graph and then moved the log to RStudio. After removing the instat object lines that aren't needed, you end up with a nice R script very quickly, and people could understand that R code and see how easy it was to generate. The data manipulation is more tricky because we have our own functions wrapped around dplyr code, so there's not much you can really learn from R-Instat that would be useful in writing your own R.

rdstern commented 6 years ago

Thanks. That's very useful. I think I am now getting there with ideas of the materials to produce.

This is currently pretty badly written, but I think has some of the key ideas.

David phrased it roughly like this: a statistician-type (or someone who wants to solve statistical problems) could usefully have mastered three tools, namely a spreadsheet, a statistical package and a statistical language. This is the supposition that we make here.

Not everyone needs all three components. Some problems can be solved with just a spreadsheet and some users are just familiar with Excel for their statistical work.

Many people use a statistical package, but not the language. This need not be related to R. For example, any of SPSS, SAS or Stata can be used simply as a menu-driven package. If more is needed, each can also be used in command mode, i.e. through its language.

In the past AIMS students have used R - the language - for their statistical work. This year we are adding R-Instat. This is a package that uses R commands (behind the scenes). Later in the year we expect students to migrate, so that they use the R language. We expect that some will then find they no longer need R-Instat, while others may wish to continue with both tools. That is, with R-Instat as the package, and with RStudio (which makes the R language pretty easy) as the environment for the R language.

To help you to migrate, we will also introduce some use of R commands while using R-Instat. This is partly through this guide, partly through a series of videos and also through your own practice.

More to come

dannyparsons commented 6 years ago

On your list of resources to learn R, I like it in principle, but you can imagine students might then spend forever watching YouTube tutorials and reading blogs about R code and never actually get to point e), or, if they did, not really know how they should "experiment" if they're less used to self-guided work. I think I would present one or two concrete starting points for them: one or two books or online courses where they will start using R code straight away, and then suggest other resources to complement this, like the tutorials and blogs. I would also ask Sam for his book recommendations. I remember he has a few favourites, and it would be good to know how he compares them, e.g. to the R for Data Science book.