mitchelloharawild / distributional

Vectorised distributions for R
https://pkg.mitchelloharawild.com/distributional
GNU General Public License v3.0

Getting levels of categorical distributions and samples #89

Closed: mjskay closed this issue 1 year ago

mjskay commented 1 year ago

Is there, or should there be, a way to get the levels in the support of a categorical distribution (or in a sample from one)?

Currently support() does not list the levels:

x = dist_categorical(list(1:5), list(letters[1:5]))
unclass(support(x))
#> $x
#> $x[[1]]
#> character(0)
#> 
#> 
#> $lim
#> $lim[[1]]
#> [1] NA

It would be nice to be able to get the list of unique values in the support, maybe here (perhaps in $lim). That would be consistent with how ggplot2 uses "limits" on discrete versus continuous scales, for example. I could also see something like a "levels" field.
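To make the "limits" idea concrete, here is a minimal base R sketch. The helper `limits_of()` is hypothetical (it is not part of distributional or ggplot2); it only illustrates the convention that continuous limits are a min/max pair while discrete limits are the set of possible values.

```r
# Hypothetical helper, for illustration only; not part of distributional.
# Returns "limits" in the sense used above:
#   - numeric input: c(min, max)
#   - factor input (ordered or not): the declared levels, in stored order
#   - anything else: the sorted distinct values, as character
limits_of <- function(x) {
  if (is.factor(x)) {
    levels(x)
  } else if (is.numeric(x)) {
    range(x, na.rm = TRUE)
  } else {
    sort(unique(as.character(x[!is.na(x)])))
  }
}

limits_of(c(2.5, 0.1, 7))                                 # min and max
limits_of(letters[1:5])                                   # distinct values
limits_of(factor(c("lo", "hi"), levels = c("lo", "hi")))  # declared levels
```

Something along these lines is what I would expect to find in $lim for a categorical distribution, though a separate "levels" field would carry the same information.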

Relatedly, if we create a dist_sample() on a character, factor, or ordered vector, support() fails:

support(dist_sample(list(letters[1:5])))
#> Error in (1 - h) * qs[i]: non-numeric argument to binary operator
support(dist_sample(list(factor(letters[1:5]))))
#> Error in quantile.default(x$x, probs = p, ..., na.rm = na.rm, names = FALSE): (unordered) factors are not allowed
support(dist_sample(list(ordered(letters[1:5]))))
#> Error in quantile.default(x$x, probs = p, ..., na.rm = na.rm, names = FALSE): 'type' must be 1 or 3 for ordered factors
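The errors above appear to come from base R's quantile() rather than from anything specific to distributional. The sketch below reproduces them in base R only; `try_quantile()` is a hypothetical wrapper that returns the error message instead of stopping. It also shows that `type = 1`, which selects observed values without interpolating, is accepted for ordered factors.

```r
x_chr <- letters[1:5]
x_fct <- factor(x_chr)
x_ord <- ordered(x_chr)

# Hypothetical wrapper: return the error message as a string instead of
# stopping, so the cases can be compared side by side.
try_quantile <- function(x, ...) {
  tryCatch(
    quantile(x, probs = c(0, 1), names = FALSE, ...),
    error = function(e) conditionMessage(e)
  )
}

try_quantile(x_chr)            # error: the default type interpolates, which needs arithmetic
try_quantile(x_fct)            # error: unordered factors are rejected outright
try_quantile(x_ord)            # error: the default type = 7 is not allowed for ordered factors
try_quantile(x_ord, type = 1)  # works: returns an ordered factor of observed values
```

So one option, under the assumption that support() for samples is built on quantile(), would be to use type = 1 for ordered factors and fall back to the unique values or levels for character and unordered factor samples.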

Happy to submit a PR implementing the desired solution.

mitchelloharawild commented 1 year ago

Thanks! Not sure what the best interface would be for this, but support() would be a sensible place for it. I can do the implementation of this one.

mjskay commented 1 year ago

Thanks!

mitchelloharawild commented 1 year ago

An improved support() method for categorical distributions has been added (following the ggplot2 approach). The problem with sample distributions is a larger issue, which I've moved to #91.

mjskay commented 1 year ago

Awesome, thanks!