IDEMSInternational / R-Instat

A statistics software package powered by R
http://r-instat.org/
GNU General Public License v3.0

Improvements and tooltips for the calculator summary keyboard #5551

Closed: rdstern closed this issue 4 years ago

rdstern commented 5 years ago

In the calculator summary keyboard the layout seems ok. Please could we get rid of the unnecessary capitals on Propn, First, Last and Mode. They can stay on IQR and anyDup, which correspond to the R functions.

A bigger point, and perhaps partly a question for @dannyparsons, is that 5 of the functions seem to be "our own" functions, namely cv, mad, mode, mc and skew. To be able to describe them I need to know where they have come from. They are functions like summary_median_absolute_deviation() for the mad key and summary_coeff_var for the cv key.

a) For some of them, why do we have our own function? There is a mad function in the stats package, for example.

b) More serious is that when we use our own functions (if they are our own), they don't give an option I could find for missing values. All the standard summary functions have an option na.rm = FALSE or TRUE. These don't seem to have any such argument.

c) A small point is that perhaps we could have shorter names, like mad. Then the tooltip and the help can explain what they are?

d) So let me go through them in turn.

cv: I think we may use the sjstats function? I would slightly prefer the version in the raster package, which gives it as a percentage.

mad: let's use the stats package one.

mc: this is, I assume, the robustbase one, so let's use that directly.

skew: I wonder what we are using. Let's use the e1071 package, which is already installed. It could be sensible here to add kurtosis, which is in the same package.

mode: I wonder what we use for the mode? I suggest we add the modeest package, which has an interesting mlv function (most likely value). It looks a thoughtful package and also has a function we may (later) add to the probabilities page to find the mode of a distribution. Aha, it uses the statip package for data, with the functions mfv (most frequent value) and mfv1 for the first mode. So we could have 2 keys, with mode(s) and mode1 as the labels.

These functions all have a consistent missing option. So that would solve the point b) above.
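To make the missing-value point concrete, here is a minimal sketch in R. It uses stats::mad, which already accepts na.rm, next to a hand-written coefficient of variation that passes na.rm through. The name cv_percent is hypothetical and is not an existing R-Instat or package function; it only illustrates how a percentage cv with a missing-value option might behave.

```r
# Hypothetical helper (not an R-Instat function): coefficient of variation
# as a percentage, with na.rm passed through to sd() and mean().
cv_percent <- function(x, na.rm = FALSE) {
  100 * stats::sd(x, na.rm = na.rm) / mean(x, na.rm = na.rm)
}

x <- c(2, 4, 4, 6, NA)

# With the default na.rm = FALSE, a single missing value gives NA.
stats::mad(x)                 # NA
cv_percent(x)                 # NA

# With na.rm = TRUE, the missing value is dropped before summarising.
stats::mad(x, na.rm = TRUE)   # 1.4826
cv_percent(x, na.rm = TRUE)
```

A key that calls a function of this shape can offer the same missing-value choice as mean, sd and the other standard summaries.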

This needs a bit of discussion before implementation.

rdstern commented 5 years ago

If the above is ok, then here is my proposed (slightly changed) layout of this keyboard. There are just 2 new keys, namely kurtosis and mode1 (first mode, which is the same as the mode key if there is only a single mode).

length, sum, min, max, range
miss, mean, median, mode, mode1
non miss, var, sd, mad, IQR
distinct, cv, mc, skew, kurtosis
anyDup, propn, first, last, nth
quantile, cor, cov (These 3 keys are in the middle of the bottom row.)
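The difference between the proposed mode and mode1 keys can be sketched in base R as below. This is an illustration only and not the statip implementation. The names all_modes and first_mode are hypothetical, and two behaviours are assumptions of the sketch: "first" means first to appear in the data, and with na.rm = FALSE a missing value is counted like any other value.

```r
# Hypothetical helpers (not the statip functions), for illustration only.

# All most frequent values, in order of first appearance in x.
all_modes <- function(x, na.rm = FALSE) {
  if (na.rm) x <- x[!is.na(x)]
  if (length(x) == 0) return(x)
  ux <- unique(x)                    # distinct values, first-appearance order
  counts <- tabulate(match(x, ux))   # frequency of each distinct value
  ux[counts == max(counts)]
}

# The first of the most frequent values only.
first_mode <- function(x, na.rm = FALSE) {
  all_modes(x, na.rm = na.rm)[1]
}

all_modes(c(2, 5, 5, 2, 7))    # 2 5  (two modes)
first_mode(c(2, 5, 5, 2, 7))   # 2
all_modes(c(3, 1, 3))          # 3    (single mode, so both keys agree)
first_mode(c(3, 1, 3))         # 3
```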

rdstern commented 5 years ago

@dannyparsons if you have time to check this is sensible, then @Wycklife could proceed?