Open rdstern opened 5 years ago
I like this idea. I was trying to obtain prediction intervals using R-Instat, but it did not seem to be possible. This is for a modelling course at AIMS. I did the same thing in R and was able to obtain my results:
predict(fit1, data.frame(Gestation = 27), interval = 'predict')
The results I was trying to replicate are on page 17 of the lecture notes 05_Saturday_Hypothesis_Testing.pdf, and the dataset is attached.
protein.txt
There is also a package called prediction that claims to make this all easier. Can you get the prediction intervals (i.e. not the usual confidence intervals) just as easily with the prediction package?
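For reference, the difference between the two interval types in base R can be sketched as below. This is only an illustration: it uses the built-in cars data rather than the attached protein data, so the model and the value speed = 15 are assumptions made for the example. It does not show whether the prediction package offers the same option.

```r
# Illustrative model on a built-in dataset (not the protein data from this issue).
fit <- lm(dist ~ speed, data = cars)
new_obs <- data.frame(speed = 15)

# Confidence interval: uncertainty about the mean response at speed = 15.
conf_int <- predict(fit, newdata = new_obs, interval = "confidence", level = 0.95)

# Prediction interval: uncertainty about a single new observation at speed = 15.
pred_int <- predict(fit, newdata = new_obs, interval = "prediction", level = 0.95)

# Same fitted value in both rows; the prediction interval is the wider one.
print(rbind(confidence = conf_int[1, ], prediction = pred_int[1, ]))
```

Both calls return the same fitted value; only the lwr and upr columns differ, because the prediction interval also includes the residual variation of a new observation.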
It is exciting that the Prediction keyboard is there in the new Use Model dialogue. I have been trying it, and it seems very close! I tried following the example in the prediction manual.
a) It starts with datasets and the iris data, which we have. So I opened that.
b) I used the Model dialogue to fit the same model as in the guide, i.e. lm(Petal.Width ~ Sepal.Length * Sepal.Width * Species), saved into model1.
c) I then used prediction(model1) and it gives an error.
d) So I used lm(Petal.Width ~ Sepal.Length * Sepal.Width * Species, data = iris) (as in the guide), i.e. I explicitly included data = iris in the lm command. This is saved in model2.
e) Now prediction(model2) does not give an error, but it doesn't give any output either. I am not sure what I expected; I think the predictions in the output window for every data point. (These are the fitted values.)
f) Then prediction(x, iris[1,]). Runs fine, but also no output.
g) Then prediction(x, at = list(Species = c("setosa", "virginica"))) and it does give output. Whoopee!
h) Then prediction(x, at = lapply(iris, mean_or_mode)). This needs prediction:: in front of the mean_or_mode function, but also works and with output.
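Written out as plain R, the steps above might look like the sketch below. It assumes the prediction package is installed (the block does nothing otherwise) and that the formula in the guide joins the terms with *. The explicit print() calls are an assumption about what may be needed to see output when a result is not auto-printed at the console; this has not been checked against R-Instat.

```r
# Runs only if the 'prediction' package is available.
if (requireNamespace("prediction", quietly = TRUE)) {
  # Step d): the data argument is given explicitly.
  model2 <- lm(Petal.Width ~ Sepal.Length * Sepal.Width * Species, data = iris)

  # Step e): predictions for every row of the original data (the fitted values).
  p_all <- prediction::prediction(model2)
  print(head(p_all$fitted))

  # Step g): predictions with Species set to each of the given values.
  p_at <- prediction::prediction(
    model2,
    at = list(Species = c("setosa", "virginica"))
  )
  print(p_at)

  # Step h): prediction at the mean (numeric) or mode (factor) of every column.
  p_typical <- prediction::prediction(
    model2,
    at = lapply(iris, prediction::mean_or_mode)
  )
  print(p_typical)
}
```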
Then onto the next example - with a new package called mlogit. The package installs and the data set is then available. Brilliant! Then the commands work to fit the model.
But then the line with prediction says:
Error in tmp[["fit"]] : subscript out of bounds
The error occurred in attempting to run the following R command(s):
.temp_val <- capture.output(prediction::prediction(mod))
This all seems very close - and very exciting to be able to get this far in R-Instat. I hope this might work even better in time for Rwanda.
Could someone in AMI also add the memory of models used in the past - as in the model dialogue, and possibly even a Try field?
@dannyparsons or @maxwellfundi the prediction package is used by the Model > Use Model dialogue, but I think it is not yet included in the set. Please could it be added?
@Ivanluv and @dannyparsons please could the Model > Use Model dialogue be enhanced. For @Ivanluv please could you repeat the "tricks" you did on the Model > Hypothesis Tests dialogue. This is to add the same features you did on that dialogue, namely:
a) Add the show arguments checkbox.
b) Add the try control.
c) Add the same Help button as the others. It shows the whole package and that is very useful.
d) Add the facility in the Expression control to remember the previous expressions.
e) Add a Save Result checkbox, though I am not sure if that will be simple?
@dannyparsons what can be added to save the results? For example, I have been trying the dialogue ready for extremes work. The erlevd function returns a vector. I assume we can save them all as simply another object?
I have been fitting an extreme value distribution, using the Model > Modelling dialogue. So last_model is the fitted values. You can try this with a sample dataset from the extRemes package, e.g. flood. Once fitted, there are 3 methods described, namely plot, print and summary.
print and summary are both available on the Model > Use Model dialogue with the general keyboard. When I try plot it doesn't give an error, and the first time it quickly seems to give the plot, but it then disappears. So it isn't getting to the output window. Can this be done?
I wondered whether it might appear if it was a ggplot, so I tried ggplotify.
I found that ggplotify::base2grob(~plot(last_model)) works - or at least doesn't give an error. Then: ggplotify::as.ggplot(ggplotify::base2grob(~plot(last_model)))
perhaps gives a ggplot, but I still don't get a plot.
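One thing that may be worth checking is whether the resulting object is ever printed, since a ggplot is only drawn when it is printed (this happens automatically at the console, but not when the result is assigned or captured). The sketch below assumes the ggplotify package is installed and uses a plot of the built-in cars data as a stand-in for plot(last_model). Whether this is the cause of the missing plot in R-Instat is not confirmed.

```r
# Runs only if the 'ggplotify' package is available.
if (requireNamespace("ggplotify", quietly = TRUE)) {
  # Stand-in for ~plot(last_model): a base-graphics call wrapped in a formula.
  p <- ggplotify::as.ggplot(~plot(cars))

  # Nothing is drawn until the object is printed.
  print(p)
}
```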
Let's separate out some of these into separate issues.
I am re-opening this issue briefly. It is great it is merged, but please could @maxwellfundi or @Ivanluv quickly change the size of the dialogue. a) It is now much wider than need be. b) It is also a bit "shorter", so the bottom line of buttons is only half-visible.
I suggest this is important. I indicate this through showing the "system" in Genstat. I think we can do even better! Here is the dialogue for fitting a model. This is like our Model > General > Fit or Model > Model
Notice that Predict is not enabled. I now run the model. This is like our Fit and this enables the Predict button. So I press Predict to give the following sub-dialogue:
In the above dialogue I changed the levels of the fertiliser to 0,1,2,3 and then it gave the following results in the output window:
I had asked, so it also saved the (same) results in 2 data frames. It has a table structure for data frames, so it saved them separately. Here is one of them:
I could also do a graph, and when I click on the options in Predict I get as follows: This gives the following graph:
Note that in Genstat you have access to all the columns in the data frame. So what happens if you: a) Don't use all the variables in the model? Here is the result ignoring the variate:
And here ignoring that factor:
These are like the margins in the 2-way table given above. Those were "Marginal weights", i.e. using the observed frequencies of the 3 varieties.
These are equal weights, i.e. weighting each variety equally.
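The difference between the two weightings can be sketched in base R as below. This is an illustration under assumptions, not the Genstat calculation itself: it uses the built-in mtcars data, with wt standing in for the variate and cyl (an unbalanced factor) standing in for variety.

```r
# Illustrative data: mtcars with cyl as an unbalanced factor (stand-in for variety).
dat <- transform(mtcars, cyl = factor(cyl))
fit <- lm(mpg ~ wt + cyl, data = dat)

# Predict for every combination of the chosen wt values and the factor levels.
grid <- expand.grid(wt = c(2, 3, 4), cyl = levels(dat$cyl))
grid$fit <- predict(fit, newdata = grid)

# Marginal weights: observed proportion of each factor level in the data.
marginal_w <- prop.table(table(dat$cyl))
grid$w <- as.numeric(marginal_w[as.character(grid$cyl)])

# Average over the factor: once with marginal weights, once with equal weights.
pred_marginal <- tapply(grid$fit * grid$w, grid$wt, sum)
pred_equal <- tapply(grid$fit, grid$wt, mean)

print(cbind(marginal = pred_marginal, equal = pred_equal))
```

With a balanced factor the two columns would coincide; they differ here only because the three levels of cyl have different numbers of observations.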
b) There is nothing in Genstat to stop you using predict with variables that are not in the model. However, then it just gives a warning that you have used terms that are not in the model, and doesn't give you anything. We can do better, because we could just include the terms that are in the model - from the corresponding object.
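Restricting the prediction data to the terms in the model could be sketched as below. This is a minimal base R illustration: the model, the newdata values and the extra column qsec are all assumptions for the example, and it does not show how R-Instat stores its model objects.

```r
# Illustrative model; qsec is deliberately not part of it.
fit <- lm(mpg ~ wt + hp, data = mtcars)
newdata <- data.frame(wt = c(2.5, 3.5), hp = c(100, 150), qsec = c(17, 18))

# Explanatory variables taken from the model object itself.
model_vars <- all.vars(delete.response(terms(fit)))

# Report, rather than fail on, anything supplied that the model does not use.
ignored <- setdiff(names(newdata), model_vars)
if (length(ignored) > 0) {
  message("Ignoring variables not in the model: ", paste(ignored, collapse = ", "))
}

# Predict using only the columns the model knows about.
pred <- predict(fit, newdata = newdata[, model_vars, drop = FALSE])
print(pred)
```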
I claim this is all very important for us in R-Instat. It sort of "completes" the model component.
In "simple (descriptive) statistics" we use the describe menu to prepare informative tables and graphs. They are direct summaries of the data.
Then we move to modelling (statistical inference). We fit models and the "recent" advances are on the range of models that we can fit with a common framework.
Once we have a suitable model we then (should) want to prepare the corresponding tables and graphs using the chosen model. That's statistics! And (see above) the predict feature is the way we do this. So in an improved (more advanced?) version of our statistical problem-solving course we could include this idea!
We have some decisions to make for R-Instat. I suggest we use the prediction package where possible. Here are some possibilities - not mutually exclusive:
The structure of the prediction data frame will need careful discussion and planning.