Closed mjskay closed 3 years ago
When #25 is added, most (all?) distributions will be implemented using vctrs::new_rcrd()
which should fix this issue.
An NA handler could also be added to dist_default
to fix this.
Sweet, thanks!
While #25 should fix this issue, I don't think it'll ever be recommended to extract single elements from the distribution vector.
Doing this will cause issues with combining distributions of different classes, and potentially return lower-level programming oriented data structures. The return outputs of sub-distribution classes (such as dist_normal()
) is subject to change.
I see that {ggdist} uses indexing which drops the distribution
class, likely intentionally to obtain multiple density/cdf/quantile from a single function call. For now this is okay (but will cause issues with NA handling as above), but when this functionality is added to the distribution class it would be better/safer to maintain the distribution meta-class
Ah yes, I was meaning to ask about the distinction between the distribution class and the other classes. It currently makes programming against the data type a bit weird since normally in R data types there is no distinction between a vector type and 1 element of that vector type (eg to support distribution objects in a generic function, unless I've missed something, seems to require an implementation for distribution
and for dist_default
?). I wonder if [[
indexing should always return a distribution
object to maintain similar semantics to base types.
I also wonder if [[
should return a <distribution>
class, and for underlying structure to be obtained with vec_data()
instead. It's definitely an option to consider.
The <distribution>
class is more of a distribution manager, allowing distributions of different classes to be combined/vectorised. The role of the class will be more important when #25 is added, as it will group distributions which have the same attributes to make use of the native vectorised methods. Methods for generics are required because it informs how the function should be dispatched to its members and re-assemble the results appropriately.
The <dist_*>
classes implement the methods for a specific distribution.
The <dist_default>
methods provide defaults for <dist_*>
classes, of which many can be approximated from other methods.
Indexing via <dist>[[1]]
will no longer enter into the distribution object, and will instead give <dist>[1]
to preserve the 'distribution' class structure. To enter into the underlying distribution objects as before, you will need to use vec_data(<dist>)[[1]]
.
I expect this will be a breaking change for ggdist, let me know if you need me to stagger this change into a later release.
Ah cool, this is a great change!
Yeah I just tested against the github version and there are breakages. When are you planning to release?
Sidebar: could you bump your package version in DESCRIPTION on github so I can do version checks against the dev version? Thanks!
Ah cool, this is a great change!
:sunglasses:
Yeah I just tested against the github version and there are breakages. When are you planning to release?
The urgency on release is to get your PRs on CRAN - so no rush on my end.
Sidebar: could you bump your package version in DESCRIPTION on github so I can do version checks against the dev version? Thanks!
Oops - hadn't realised this wasn't done yet. Fixed in ea3df2f4d7c7e092dcba4e349dc3266fa08bf21c
The urgency on release is to get your PRs on CRAN - so no rush on my end.
Ah k cool. I just made a new release so I'll probably wait a couple weeks to not make CRAN mad at me for submitting too often. I'll ping back when I do.
Oops - hadn't realised this wasn't done yet. Fixed in ea3df2f
Sweet thanks!
Applying is.na over a vector of distributions gives the expected result:
But if you extract individual elements, is.na() seems to be applied over the arguments:
I'm running into this in
ggdist
because (at a high level) I need to do an equivalent of lapply over a vector of distributions, but need different behavior for NA values compared to non-NA values. It appears that the problem is extracting an NA from such a vector yieldsNULL
instead of a missing value:It's easy enough for me to work around, but seems like a possible inconsistency in the API.