IDEMSInternational / R-Instat

A statistics software package powered by R
http://r-instat.org/
GNU General Public License v3.0
Improving R-Instat for random sampling? #5100

Open rdstern opened 5 years ago

rdstern commented 5 years ago

I am using the rice survey to discuss how random sampling might be made easier in R-Instat.

The essence is the sample command in R, which we already have in Prepare > Column: Reshape > Random subset.

My problem was first to sample villages from the 10 available. The selection could be either with equal probability or with probability proportional to size. I took 4 as follows (in the script window, though it could alternatively have been in the File > New dialogue):

villages <- sample(1:10, 4, prob = c(20,10,8,30,11,24,18,21,6,12)/160)
sort(villages)

Then - a bit manually - I used File > New Data Frame with the following commands:

vill <- rep(c(4,7,8,9), times = c(6,6,6,6))
v4 <- sample(30, 6)
v7 <- sample(18, 6)
v8 <- sample(21, 6)
v9 <- sample(6, 6)
fld <- c(v4, v7, v8, v9)
data.frame(village = as.factor(vill), field = fld)

I was really pleased that this simple use of R in R-Instat (with the script window and the new File > New Data Frame) produced a sensible data frame. It gives rise to a number of questions:

a) Could the R code be improved for this task? Is this a good example, or should we have told ourselves to go to RStudio for this? (I think - of course - that this simple task fits OK in R-Instat!)

b) Could there be a sensible dialogue for this sort of thing? It needs careful thought. We could, at least, include sample and rep in the examples in File > New Data Frame and also consider this sort of example in the help file.

c) (In the future) could the grid be improved so that we could then suggest the data be entered into the grid?
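On question (a), one possible tidy-up is sketched below. It is only an illustration, not existing R-Instat functionality: the helper name two_stage_sample, its arguments (sizes, n_villages, n_fields, pps) and the seed are all made up for this example. The sizes vector is assumed to hold the number of fields in each village, which is what the prob values and the second-stage sample calls above suggest. The sketch removes the manual step of copying the selected village numbers and their sizes into the second stage. Note that with replace = FALSE, sample applies prob sequentially to the remaining items, so the first stage only approximates strict probability-proportional-to-size selection.

```r
# Assumed village sizes (number of fields), taken from the prob vector above
sizes <- c(20, 10, 8, 30, 11, 24, 18, 21, 6, 12)

# two_stage_sample() is a hypothetical helper, not an R-Instat function
two_stage_sample <- function(sizes, n_villages, n_fields, pps = TRUE) {
  # Stage 1: draw villages with equal probability, or weighted by size
  prob <- if (pps) sizes / sum(sizes) else NULL
  villages <- sort(sample.int(length(sizes), n_villages, prob = prob))

  # Stage 2: draw fields within each selected village,
  # capped at the village size so small villages cannot fail
  fields <- lapply(villages, function(v) {
    sort(sample.int(sizes[v], min(n_fields, sizes[v])))
  })

  data.frame(
    village = factor(rep(villages, times = lengths(fields))),
    field = unlist(fields)
  )
}

set.seed(5100)  # arbitrary seed so the draw can be reproduced
rice_sample <- two_stage_sample(sizes, n_villages = 4, n_fields = 6)
rice_sample
```

Setting pps = FALSE gives the equally likely version of the first stage. Using sample.int rather than sample avoids the surprise that sample(x, ...) treats a single number x as 1:x.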

shadrackkibet commented 5 years ago

Good that this has been reported here. When doing my assignment I tried random sampling in R-Instat, but I couldn't do it "simply", so I resorted to doing it in R.