stan-dev / math

The Stan Math Library is a C++ template library for automatic differentiation of any order using forward, reverse, and mixed modes. It includes a range of built-in functions for probabilistic modeling, linear algebra, and equation solving.
https://mc-stan.org
BSD 3-Clause "New" or "Revised" License

Request for vector return from normal_cdf given vector arguments #2468

Open adkinsty opened 3 years ago

adkinsty commented 3 years ago

I want to evaluate the normal CDF on vector inputs to obtain a vector of cumulative probabilities. However, I am getting a dimension mismatch error. The error says the right-hand side is of type “real”. The Stan documentation says that the cdf function accepts “reals” arguments and returns “reals”. I thought this pseudotype “reals” included vectors, yet the function appears to return a scalar.

Example code:

vector[N] mu;
vector[N] sigma;
vector[N] x;
vector[N] phi;

phi = normal_cdf(x, mu, sigma);

Example error:

SYNTAX ERROR, MESSAGE(S) FROM PARSER:
Dimension mismatch in assignment; variable name = phi, type = vector; right-hand side type = real.
Illegal statement beginning with non-void expression parsed as
  phi

SteveBronder commented 3 years ago

Yes, all the lpdf and cdf functions return a scalar. A real in the Stan language is a scalar.

It would be nice to have a vlpdf-style function that returned a vector, but we haven't had time to implement it.
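
To make the distinction concrete, here is a minimal sketch in plain C++ (standard library only, not the Stan Math API). It assumes the scalar returned for vector arguments is the product of the per-element CDFs, which is how the Stan documentation describes vectorized cdf functions. The names `normal_cdf_one`, `normal_cdf_elementwise`, and `normal_cdf_product` are hypothetical and exist only for this illustration.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Normal CDF for a single observation, written with the complementary
// error function: Phi((x - mu) / sigma) = 0.5 * erfc(-(x - mu) / (sigma * sqrt(2))).
double normal_cdf_one(double x, double mu, double sigma) {
  return 0.5 * std::erfc(-(x - mu) / (sigma * std::sqrt(2.0)));
}

// Hypothetical vector-returning version (what the issue asks for):
// one cumulative probability per observation.
// Assumes x, mu, and sigma all have the same length.
std::vector<double> normal_cdf_elementwise(const std::vector<double>& x,
                                           const std::vector<double>& mu,
                                           const std::vector<double>& sigma) {
  std::vector<double> out(x.size());
  for (std::size_t i = 0; i < x.size(); ++i) {
    out[i] = normal_cdf_one(x[i], mu[i], sigma[i]);
  }
  return out;
}

// Scalar reduction (assumed to mirror the current behaviour):
// the product of the per-element CDFs.
double normal_cdf_product(const std::vector<double>& x,
                          const std::vector<double>& mu,
                          const std::vector<double>& sigma) {
  double prod = 1.0;
  for (double p : normal_cdf_elementwise(x, mu, sigma)) {
    prod *= p;
  }
  return prod;
}
```

Under that assumption, the assignment in the original report fails because the right-hand side is the single number from the reduction, while `phi` needs the vector produced by the elementwise version.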

spinkney commented 3 years ago

This would be nice to have for all the copula stuff as well, since the copulas use the marginal CDFs evaluated at each data point.

wds15 commented 3 years ago

But I don't think there would really be a performance gain, so I would think these functions can live in the Stan language.

spinkney commented 3 years ago

> But I don't think there would really be a performance gain, so I would think these functions can live in the Stan language.

+1 ^

Also, the row/col-wise framework proposed by @andrjohns would solve this issue.

wlandau commented 1 year ago

This feature would really help when we need access to individual observation-level log likelihoods but still want to take advantage of the vectorization in the _lpdf functions. Example: https://arxiv.org/abs/2209.09190
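
As an illustration of the observation-level use case, the following plain C++ sketch (standard library only, not the Stan Math implementation) computes one normal log density per observation and shows that the usual scalar result is just the sum of those terms. The function names `normal_lpdf_pointwise` and `normal_lpdf_sum` are hypothetical, and a shared scalar `mu` and `sigma` are assumed for brevity.

```cpp
#include <cmath>
#include <cstddef>
#include <numeric>
#include <vector>

// Hypothetical vector-returning log density: one term per observation,
// log N(y_i | mu, sigma) = -0.5*log(2*pi) - log(sigma) - 0.5*((y_i - mu)/sigma)^2.
std::vector<double> normal_lpdf_pointwise(const std::vector<double>& y,
                                          double mu, double sigma) {
  const double log_sqrt_two_pi = 0.5 * std::log(2.0 * std::acos(-1.0));
  std::vector<double> out(y.size());
  for (std::size_t i = 0; i < y.size(); ++i) {
    const double z = (y[i] - mu) / sigma;
    out[i] = -log_sqrt_two_pi - std::log(sigma) - 0.5 * z * z;
  }
  return out;
}

// Scalar version: the joint log density is the sum of the pointwise terms.
double normal_lpdf_sum(const std::vector<double>& y, double mu, double sigma) {
  const std::vector<double> terms = normal_lpdf_pointwise(y, mu, sigma);
  return std::accumulate(terms.begin(), terms.end(), 0.0);
}
```

The pointwise vector is what model-comparison tools that work on per-observation log likelihoods need, whereas the scalar sum is all that the sampler itself requires.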

hansvancalster commented 1 week ago

It would also help in the case of interval censoring. See this comment: https://github.com/paul-buerkner/brms/issues/1657#issuecomment-2367156113
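
A rough sketch of why interval censoring needs per-observation CDF values, in plain C++ with the standard library only. It uses the standard interval-censored likelihood, where observation i contributes P(lower_i < Y_i <= upper_i) = F(upper_i) - F(lower_i); this is not taken from the linked comment, and the names `normal_cdf_scalar` and `interval_censored_log_lik` are hypothetical.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Normal CDF for a single value, via the complementary error function.
double normal_cdf_scalar(double x, double mu, double sigma) {
  return 0.5 * std::erfc(-(x - mu) / (sigma * std::sqrt(2.0)));
}

// Log likelihood for interval-censored normal data: each observation is
// known only to lie in (lower[i], upper[i]], so it contributes
// log(F(upper[i]) - F(lower[i])). The difference must be taken per
// observation, before any reduction across observations.
// Assumes lower and upper have the same length and lower[i] < upper[i].
double interval_censored_log_lik(const std::vector<double>& lower,
                                 const std::vector<double>& upper,
                                 double mu, double sigma) {
  double total = 0.0;
  for (std::size_t i = 0; i < lower.size(); ++i) {
    const double p = normal_cdf_scalar(upper[i], mu, sigma)
                     - normal_cdf_scalar(lower[i], mu, sigma);
    total += std::log(p);
  }
  return total;
}
```

A cdf that has already been reduced to one scalar over all observations cannot be used here, because the difference of two products is not the product of the per-observation differences.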

SteveBronder commented 1 week ago

I have a PR that shows an example of how we can do this. The main thing is that it's a huge rewrite of our distributions, which would take a lot of elbow grease. It would be nice to have a Stan hackathon to go through the distributions and split the work up over several people. We also need a test suite change so that all of these are tested for vector returns.

https://github.com/stan-dev/math/pull/2751