stan-dev / posterior

The posterior R package
https://mc-stan.org/posterior/
Other
167 stars 24 forks source link

In summarise draws, a function that returns a character changes all columns to character #355

Open n-kall opened 8 months ago

n-kall commented 8 months ago

If a summary function that returns a character is used in summarise_draws, it makes all columns <chr> which messes up formatting.

example_draws() |>
   summarise_draws(x, char_fun = \(x) "word", mean)

Returns

# A tibble: 10 × 3
   variable char_fun      mean
   <chr>    <chr>         <chr>
 1 mu       word          4.1799990610081
 2 tau      word          4.16356885610418
 3 theta[1] word          6.74893947963962
 4 theta[2] word          5.25331634990726
 5 theta[3] word          3.04393475572924
 6 theta[4] word          4.85842854299997
 7 theta[5] word          3.22258991784476
 8 theta[6] word          3.98696993634695
 9 theta[7] word          6.5030995214442
10 theta[8] word          4.56520199862817

This could be fixed by using type.convert(out, as.is = TRUE) on the output tibble in summarise_draws_helper

mjskay commented 8 months ago

This suggests to me that the output is being bound into a matrix (which requires a single type) and then converted to a data frame later, which it should probably not be. Rather than potentially round-tripping through a string (with the possibility for information loss) I would suggest binding results together into a data frame directly, which should be more efficient anyway.

n-kall commented 8 months ago

This suggests to me that the output is being bound into a matrix

Yes, this is indeed the case

paul-buerkner commented 8 months ago

I agree. that would make sense

n-kall @.***> schrieb am Do., 14. März 2024, 18:52:

This suggests to me that the output is being bound into a matrix

Yes, this is indeed the case

— Reply to this email directly, view it on GitHub https://github.com/stan-dev/posterior/issues/355#issuecomment-1997904615, or unsubscribe https://github.com/notifications/unsubscribe-auth/ADCW2ACGQU7UVGXE5IGYAHDYYHIVHAVCNFSM6AAAAABEVXULC2VHI2DSMVQWIX3LMV43OSLTON2WKQ3PNVWWK3TUHMYTSOJXHEYDINRRGU . You are receiving this because you are subscribed to this thread.Message ID: @.***>

n-kall commented 8 months ago

It seems this is also caused by the behaviour of unlist, which results in a vector, and when performed on a list with different types inside, will convert them all to the same (in this case character). This unlisting is done to handle summary functions that return multiple elements (e.g. quantile2).

However, I don't know of a base R equivalent to purrr::flatten which would do what we want in this case (returning a list, but without nesting).