njtierney closed this issue 3 years ago
Duplicate of #36, but I'll close #36 as this provides more detail.
OK cool, thanks, @mitchelloharawild !
Let me know if I can help.
I'm trying to work out the nicest output format for this; better ideas would be appreciated.
This is my current attempt (in the `parameters` branch, https://github.com/mitchelloharawild/distributional/commit/c579a2ec52d0561834548a94c59eef9d58b6c8b2), but I don't know of any nice functions for unpacking named lists.
library(tidyverse)
library(distributional)
For a simple (single class) distribution vector.
dist <- dist_normal(1:3)
parameters(dist)
#> [[1]]
#> [[1]]$mu
#> [1] 1
#>
#> [[1]]$sigma
#> [1] 1
#>
#>
#> [[2]]
#> [[2]]$mu
#> [1] 2
#>
#> [[2]]$sigma
#> [1] 1
#>
#>
#> [[3]]
#> [[3]]$mu
#> [1] 3
#>
#> [[3]]$sigma
#> [1] 1
tibble(dist = dist, par = parameters(dist))
#> # A tibble: 3 × 2
#> dist par
#> <dist> <list>
#> 1 N(1, 1) <named list [2]>
#> 2 N(2, 1) <named list [2]>
#> 3 N(3, 1) <named list [2]>
For mixed distribution classes.
dist <- c(dist_normal(1:2), dist_poisson(3))
parameters(dist)
#> [[1]]
#> [[1]]$mu
#> [1] 1
#>
#> [[1]]$sigma
#> [1] 1
#>
#>
#> [[2]]
#> [[2]]$mu
#> [1] 2
#>
#> [[2]]$sigma
#> [1] 1
#>
#>
#> [[3]]
#> [[3]]$l
#> [1] 3
tibble(dist = dist, par = parameters(dist))
#> # A tibble: 3 × 2
#> dist par
#> <dist> <list>
#> 1 N(1, 1) <named list [2]>
#> 2 N(2, 1) <named list [2]>
#> 3 Pois(3) <named list [1]>
Created on 2021-09-14 by the reprex package (v2.0.0)
If you know all parameters have the same structure, you can cast them to a tibble column using `bind_rows()`:
library(tidyverse)
library(distributional)
For a simple (single class) distribution vector.
dist <- dist_normal(1:3)
parameters(dist)
#> [[1]]
#> [[1]]$mu
#> [1] 1
#>
#> [[1]]$sigma
#> [1] 1
#>
#>
#> [[2]]
#> [[2]]$mu
#> [1] 2
#>
#> [[2]]$sigma
#> [1] 1
#>
#>
#> [[3]]
#> [[3]]$mu
#> [1] 3
#>
#> [[3]]$sigma
#> [1] 1
tibble(dist = dist, par = bind_rows(parameters(dist)))
#> # A tibble: 3 × 2
#> dist par$mu $sigma
#> <dist> <dbl> <dbl>
#> 1 N(1, 1) 1 1
#> 2 N(2, 1) 2 1
#> 3 N(3, 1) 3 1
Created on 2021-09-14 by the reprex package (v2.0.0)
This also works for mixed classes, but gives NAs for mismatched names.
library(tidyverse)
library(distributional)
dist <- c(dist_normal(1:2), dist_poisson(3))
parameters(dist)
#> [[1]]
#> [[1]]$mu
#> [1] 1
#>
#> [[1]]$sigma
#> [1] 1
#>
#>
#> [[2]]
#> [[2]]$mu
#> [1] 2
#>
#> [[2]]$sigma
#> [1] 1
#>
#>
#> [[3]]
#> [[3]]$l
#> [1] 3
tibble(dist = dist, par = bind_rows(parameters(dist)))
#> # A tibble: 3 × 2
#> dist par$mu $sigma $l
#> <dist> <dbl> <dbl> <dbl>
#> 1 N(1, 1) 1 1 NA
#> 2 N(2, 1) 2 1 NA
#> 3 Pois(3) NA NA 3
Created on 2021-09-14 by the reprex package (v2.0.0)
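The NA filling for mismatched names can also be reproduced without the tidyverse. Below is a minimal base R sketch, assuming every parameter is a single numeric value (so it would not cover vector or matrix parameters). The helper `unpack_params()` and the hand-built `pars` list are hypothetical stand-ins that mimic the list output shown above; they are not part of distributional.

```r
# Hand-built stand-in for the list-of-named-lists output shown above
pars <- list(
  list(mu = 1, sigma = 1),
  list(mu = 2, sigma = 1),
  list(l = 3)
)

# Hypothetical helper: row-bind named lists, filling absent names with NA.
# Assumes each parameter is a length-one numeric value.
unpack_params <- function(x) {
  # Union of parameter names, in order of first appearance
  all_names <- unique(unlist(lapply(x, names)))
  cols <- lapply(all_names, function(nm) {
    vapply(
      x,
      function(p) if (is.null(p[[nm]])) NA_real_ else as.numeric(p[[nm]]),
      numeric(1)
    )
  })
  names(cols) <- all_names
  as.data.frame(cols)
}

res <- unpack_params(pars)
print(res)
```

Each distribution becomes one row, and a parameter that a distribution does not have shows up as NA in that row.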
I think this works well. Having NAs is fine, since it makes sense that those parameters aren't there.
Perhaps there could be two versions of `parameters`:
- `parameters` - returns a list
- `parameters_dfr` - returns a row-bound data frame?

It might even be more consistent with the design of the package (https://github.com/mitchelloharawild/distributional/issues/52#issuecomment-886370060) to only provide `parameters_dfr()` as just `parameters()`. I'd prefer to avoid the `*_dfr()` suffix if possible.
Yeah I mean I think that's good, I do like a world of dataframes as output :)
I've changed `parameters()` to return the data frame format, and also extended it to support more complex arguments (multivariate, matrix parameters, etc.). Here is what we currently get:
library(distributional)
dist <- c(
dist_normal(1:2),
dist_poisson(3),
dist_multinomial(size = c(4, 3), prob = list(c(0.3, 0.5, 0.2), c(0.1, 0.5, 0.4)))
)
parameters(dist)
#> mu sigma l s p
#> 1 1 1 NA NA NULL
#> 2 2 1 NA NA NULL
#> 3 NA NA 3 NA NULL
#> 4 NA NA NA 4 0.3, 0.5, 0.2
#> 5 NA NA NA 3 0.1, 0.5, 0.4
tibble::as_tibble(parameters(dist))
#> # A tibble: 5 × 5
#> mu sigma l s p
#> <dbl> <dbl> <dbl> <dbl> <list>
#> 1 1 1 NA NA <NULL>
#> 2 2 1 NA NA <NULL>
#> 3 NA NA 3 NA <NULL>
#> 4 NA NA NA 4 <dbl [3]>
#> 5 NA NA NA 3 <dbl [3]>
Created on 2021-10-04 by the reprex package (v2.0.0)
Note the difference between missing list entries (shown as NULL) and missing vector entries (shown as NA). It looks strange, but I think this is the best solution here.
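To make the NULL versus NA distinction concrete, here is a small base R sketch. It rebuilds by hand a data frame shaped like the output above rather than calling `parameters()`, so the `params` object is an assumption for illustration only. Missing values in an atomic column can be found with `is.na()`, while missing entries in a list column are zero-length and can be found with `lengths()`.

```r
# Hand-built stand-in shaped like the parameters() output above
params <- data.frame(
  mu    = c(1, 2, NA, NA, NA),
  sigma = c(1, 1, NA, NA, NA),
  l     = c(NA, NA, 3, NA, NA),
  s     = c(NA, NA, NA, 4, 3)
)
# List column: NULL marks "no such parameter" for that distribution
params$p <- list(NULL, NULL, NULL, c(0.3, 0.5, 0.2), c(0.1, 0.5, 0.4))

# Atomic columns: a missing parameter is NA
missing_mu <- is.na(params$mu)

# List columns: a missing parameter is NULL, which has length zero
missing_p <- lengths(params$p) == 0

print(missing_mu)
print(missing_p)
```

Checking `lengths()` instead of `is.na()` matters here because `is.na()` applied to a list column tests each element as a whole and does not flag NULL entries.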
Is there a way to extract out the distribution parameter values from a distributional object?
It would be handy to be able to extract this information out to explore distribution objects.
If this isn't already implemented, perhaps the function could be called something like `parameters`?
Created on 2021-09-14 by the reprex package (v2.0.1)
Session info
``` r
sessioninfo::session_info()
#> ─ Session info ───────────────────────────────────────────────────────────────
#>  setting  value
#>  version  R version 4.1.0 (2021-05-18)
#>  os       macOS Big Sur 10.16
#>  system   x86_64, darwin17.0
#>  ui       X11
#>  language (EN)
#>  collate  en_AU.UTF-8
#>  ctype    en_AU.UTF-8
#>  tz       Australia/Perth
#>  date     2021-09-14
#>
#> ─ Packages ───────────────────────────────────────────────────────────────────
#>  package        * version date       lib source
#>  assertthat       0.2.1   2019-03-21 [1] CRAN (R 4.1.0)
#>  backports        1.2.1   2020-12-09 [1] CRAN (R 4.1.0)
#>  cli              3.0.1   2021-07-17 [1] CRAN (R 4.1.0)
#>  colorspace       2.0-2   2021-06-24 [1] CRAN (R 4.1.0)
#>  crayon           1.4.1   2021-02-08 [1] CRAN (R 4.1.0)
#>  DBI              1.1.1   2021-01-15 [1] CRAN (R 4.1.0)
#>  digest           0.6.27  2020-10-24 [1] CRAN (R 4.1.0)
#>  distributional * 0.2.2   2021-02-02 [1] CRAN (R 4.1.0)
#>  dplyr            1.0.7   2021-06-18 [1] CRAN (R 4.1.0)
#>  ellipsis         0.3.2   2021-04-29 [1] CRAN (R 4.1.0)
#>  evaluate         0.14    2019-05-28 [1] CRAN (R 4.1.0)
#>  fansi            0.5.0   2021-05-25 [1] CRAN (R 4.1.0)
#>  farver           2.1.0   2021-02-28 [1] CRAN (R 4.1.0)
#>  fs               1.5.0   2020-07-31 [1] CRAN (R 4.1.0)
#>  generics         0.1.0   2020-10-31 [1] CRAN (R 4.1.0)
#>  ggplot2          3.3.5   2021-06-25 [1] CRAN (R 4.1.0)
#>  glue             1.4.2   2020-08-27 [1] CRAN (R 4.1.0)
#>  gtable           0.3.0   2019-03-25 [1] CRAN (R 4.1.0)
#>  highr            0.9     2021-04-16 [1] CRAN (R 4.1.0)
#>  htmltools        0.5.1.1 2021-01-22 [1] CRAN (R 4.1.0)
#>  knitr            1.33    2021-04-24 [1] CRAN (R 4.1.0)
#>  lifecycle        1.0.0   2021-02-15 [1] CRAN (R 4.1.0)
#>  magrittr         2.0.1   2020-11-17 [1] CRAN (R 4.1.0)
#>  munsell          0.5.0   2018-06-12 [1] CRAN (R 4.1.0)
#>  pillar           1.6.2   2021-07-29 [1] CRAN (R 4.1.0)
#>  pkgconfig        2.0.3   2019-09-22 [1] CRAN (R 4.1.0)
#>  purrr          * 0.3.4   2020-04-17 [1] CRAN (R 4.1.0)
#>  R6               2.5.1   2021-08-19 [1] CRAN (R 4.1.0)
#>  reprex           2.0.1   2021-08-05 [1] CRAN (R 4.1.0)
#>  rlang            0.4.11  2021-04-30 [1] CRAN (R 4.1.0)
#>  rmarkdown        2.9     2021-06-15 [1] CRAN (R 4.1.0)
#>  rstudioapi       0.13    2020-11-12 [1] CRAN (R 4.1.0)
#>  scales           1.1.1   2020-05-11 [1] CRAN (R 4.1.0)
#>  sessioninfo      1.1.1   2018-11-05 [1] CRAN (R 4.1.0)
#>  stringi          1.7.3   2021-07-16 [1] CRAN (R 4.1.0)
#>  stringr          1.4.0   2019-02-10 [1] CRAN (R 4.1.0)
#>  styler           1.4.1   2021-03-30 [1] CRAN (R 4.1.0)
#>  tibble           3.1.3   2021-07-23 [1] CRAN (R 4.1.0)
#>  tidyselect       1.1.1   2021-04-30 [1] CRAN (R 4.1.0)
#>  utf8             1.2.2   2021-07-24 [1] CRAN (R 4.1.0)
#>  vctrs            0.3.8   2021-04-29 [1] CRAN (R 4.1.0)
#>  withr            2.4.2   2021-04-18 [1] CRAN (R 4.1.0)
#>  xfun             0.24    2021-06-15 [1] CRAN (R 4.1.0)
#>  yaml             2.2.1   2020-02-01 [1] CRAN (R 4.1.0)
#>
#> [1] /Library/Frameworks/R.framework/Versions/4.1/Resources/library
```