pymc-devs / pymc

Bayesian Modeling and Probabilistic Programming in Python
https://docs.pymc.io/

Implement several RandomVariables as SymbolicRandomVariables #7239

Closed ricardoV94 closed 2 months ago

ricardoV94 commented 3 months ago

Description

Implement several RandomVariables as SymbolicRandomVariables

This allows sampling from multiple backends without having to dispatch a separate sampling implementation for each one.
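The idea, sketched very roughly below with plain PyTensor (an illustration of the concept only, not the PR's code or PyMC's SymbolicRandomVariable API): if a distribution is expressed as a symbolic graph over more primitive random variables, any backend that can already sample the primitives can sample the composite, with no per-backend dispatch needed.

```python
# Conceptual sketch only (not PyMC's SymbolicRandomVariable API):
# a "half-normal" written as a graph over a primitive normal RV.
# A backend only needs to know how to sample `normal`; the rest of the
# graph (here, just `abs`) is ordinary deterministic PyTensor code.
import pytensor
from pytensor.tensor.random.utils import RandomStream

srng = RandomStream(seed=123)
z = srng.normal(0.0, 1.0, size=(1000,))
halfnormal = abs(z)  # symbolic composite; no backend-specific sampling code

draw = pytensor.function([], halfnormal)
samples = draw()  # fresh draws on each call
```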

Also:

Related Issue

Checklist

Type of change


📚 Documentation preview 📚: https://pymc--7239.org.readthedocs.build/en/7239/

ricardoV94 commented 3 months ago

> Seems like this is mostly shuffling around and simplifying stuff that already exists, so if tests pass it looks great.

There are two commits in this PR. The first shuffles and refactors things so that the second commit (converting existing RandomVariables to SymbolicRandomVariables) goes through mostly without hassle.

> I'm a bit unclear on how the signatures for RVs work; was that changed in this PR or another one? The square brackets in particular throw me off.

I think this is the kind of signature we should use in PyTensor, so I was testing it here. The numpy vectorize signature doesn't really handle things like rng, size, and axis, so I think we should take the initiative here. In a previous PR I pretended those were scalars, so things looked like (),(),(),()->(),(), which is both fake and, I think, less readable than [rng],[size],(),()->[rng],(). Wdyt? This would ultimately replace ndim_supp and ndims_params, which are very strict and only work because we assume all RVs have the same non-numerical signature. That assumption is not useful for SymbolicRandomVariables, which can have multiple (or zero) RNGs and only optionally take a size.
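To make the bracketed notation concrete, here is a small, purely illustrative parser; `parse_extended_signature` is a hypothetical helper written for this comment, not an existing PyMC or PyTensor function. It just splits a string like `[rng],[size],(),()->[rng],()` into the non-numerical `[...]` tokens and the usual gufunc-style core-dimension tuples.

```python
import re

def parse_extended_signature(sig: str):
    """Hypothetical helper: split '[rng],[size],(),()->[rng],()' into tokens."""
    token = re.compile(r"\[(\w+)\]|\(([\w,]*)\)")

    def tokens(side: str):
        parsed = []
        for m in token.finditer(side):
            if m.group(1) is not None:
                parsed.append(("special", m.group(1)))  # non-numerical: rng, size, ...
            else:
                dims = tuple(d for d in m.group(2).split(",") if d)
                parsed.append(("core_dims", dims))  # numerical param or output
        return parsed

    inputs, outputs = sig.split("->")
    return tokens(inputs), tokens(outputs)

# ([('special', 'rng'), ('special', 'size'), ('core_dims', ()), ('core_dims', ())],
#  [('special', 'rng'), ('core_dims', ())])
print(parse_extended_signature("[rng],[size],(),()->[rng],()"))
```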

> There are also some opportunities for refactoring duplicated code.

I'm wary of some of the helpers, but I can optimize for a bit less DRY.

ricardoV94 commented 3 months ago

@jessegrabowski I simplified the logic needed to build standard SymbolicRandomVariables; what do you think?