mitchelloharawild / distributional

Vectorised distributions for R
https://pkg.mitchelloharawild.com/distributional
GNU General Public License v3.0

mean/variance may be confusing #20

Closed hongooi73 closed 3 years ago

hongooi73 commented 4 years ago

@mitchelloharawild thanks for pointing me to this development in tidyverts. Some great work here.

Just a question though: you use mean() and variance() to extract these measures from distributions, and they return vectors when used on a list of distribution objects. I am not sure these are the best names for the generics; while variance might be OK, mean in any other context returns a scalar, which is the mean of all inputs combined. Having to write something like sum(mean(distrib)) to get the total of the predicted forecasts for a set of rows in a tsibble is unexpected.

Of course, I'm also not sure of the best name to use in its place. dist_mean maybe? Or maybe m1 and m2 to refer to the 1st and 2nd moments?
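
To make the concern above concrete, here is a minimal sketch of the behaviour being described, assuming the distributional package is installed. The variable names and the particular distributions are illustrative only and do not come from the issue.

```r
library(distributional)

# A vector of two distributions, e.g. forecasts for two rows of a tsibble.
forecasts <- c(dist_normal(mu = 10, sigma = 2), dist_poisson(lambda = 5))

# mean() and variance() return one value per distribution, not a single scalar.
means <- mean(forecasts)
variances <- variance(forecasts)

# Summing the per-distribution means gives the total of the point forecasts.
total_forecast <- sum(means)

# For comparison, base R mean() on plain numbers collapses to one scalar.
base_mean <- mean(c(10, 5))

print(means)
print(variances)
print(total_forecast)
print(base_mean)
```

The contrast between `means` (length 2) and `base_mean` (length 1) is the naming ambiguity raised here.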

mitchelloharawild commented 4 years ago

I think moments deserve a separate function that accepts the order of the moment as a parameter. I had similar concerns when using mean(), especially as you could potentially 'average' a distribution / random variable. An alternative name is expectation(), but mean() is very nice to work with. I think I'll need to poll this for opinions.
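
As a rough illustration of what a moment function with an order parameter could look like, here is a base R sketch for normal distributions only. The name `normal_moment` and its signature are hypothetical and not part of the package; it computes raw moments by numerical integration rather than in closed form.

```r
# Hypothetical helper: raw moment E[X^order] of Normal(mu, sigma),
# vectorised over mu and sigma and computed by numerical integration.
normal_moment <- function(mu, sigma, order = 1) {
  mapply(function(m, s) {
    integrate(
      function(x) x^order * dnorm(x, mean = m, sd = s),
      lower = -Inf, upper = Inf
    )$value
  }, mu, sigma)
}

mu <- c(0, 1)
sigma <- c(1, 2)

# First raw moment: approximately mu.
first_moment <- normal_moment(mu, sigma, order = 1)

# Second raw moment: approximately mu^2 + sigma^2.
second_moment <- normal_moment(mu, sigma, order = 2)

print(first_moment)
print(second_moment)
```

Like mean() in the package, this sketch returns one value per distribution, so the question of scalar versus vector output applies to it in the same way.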

mitchelloharawild commented 4 years ago

I've polled around a few places and opinions seem mixed. For now I think it's simplest to stick with mean(); it can fairly easily be soft-deprecated at a later date if others find it confusing.

mitchelloharawild commented 3 years ago

Opinions are mixed here, but I am closing the issue now as the package is available on CRAN and the mean()/variance() function naming will be kept indefinitely. As it doesn't make much sense to return a single-valued object when mean(<distribution>) or variance(<distribution>) is used, this functionality will not change. At worst, if the naming of the functions proves confusing in the future and we pluralise these names, the existing functions would be kept as aliases, possibly with a warning.
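
The "alias with a warning" fallback mentioned above could look roughly like the following base R sketch. The function names `dist_means` and `dist_mean` and the toy list standing in for a distribution vector are hypothetical; this is not how the package is implemented.

```r
# Toy stand-in for a vector of distributions: a list of parameter lists.
toy <- list(list(mu = 10), list(mu = 5))

# Hypothetical pluralised name: returns one mean per distribution.
dist_means <- function(x) {
  vapply(x, function(d) d$mu, numeric(1))
}

# Hypothetical old name, kept as an alias that warns before delegating.
dist_mean <- function(x) {
  .Deprecated("dist_means")
  dist_means(x)
}

new_result <- dist_means(toy)

# The alias still works; the deprecation warning is silenced here for brevity.
old_result <- suppressWarnings(dist_mean(toy))

print(new_result)
print(old_result)
```

Existing code that calls the old name keeps running and gets the same values, while the warning points users to the new name.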