stan-dev / math

The Stan Math Library is a C++ template library for automatic differentiation of any order using forward, reverse, and mixed modes. It includes a range of built-in functions for probabilistic modeling, linear algebra, and equation solving.
https://mc-stan.org
BSD 3-Clause "New" or "Revised" License

unit_vector transform should work with zero-length unconstrained #2569

Open bob-carpenter opened 3 years ago

bob-carpenter commented 3 years ago

Description

Unit vector transforms should work with a zero-length unconstrained vector.

Example

The size-1 unit vector is just a constant [1]'. This should be the result of transforming the unconstrained 0-vector []'.

Expected Output

Stan programs that use:

parameters {
  unit_vector[1] alpha;
}

should lead to alpha == [1]'.

Current Version:

v4.1.0

PracticalMetal commented 3 years ago

Hi, I would like to work on this. Can you suggest ways to get started?

nhuurre commented 3 years ago

The expected change is adding if (size(y)==0) branches in stan/math/prim/fun/unit_vector_constrain.hpp and a test case in test/unit/math/prim/fun/unit_vector_constrain.hpp.

However, I think the issue is wrong. As currently implemented, the constraining transform does not change the size of the vector. (That's not ideal, but we need some hack to match the constrained and unconstrained topologies.) So a size-0 unconstrained vector should result not in a size-1 but in a size-0 unit vector, and there is no such thing.
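To make the size argument concrete, here is a minimal standalone sketch of a size-preserving normalization y / ||y||. It is not the Stan Math implementation: the name unit_vector_constrain_sketch, the use of std::vector<double>, and the exception types are assumptions chosen for illustration only.

```cpp
#include <cmath>
#include <cstddef>
#include <stdexcept>
#include <vector>

// Hypothetical helper for illustration; not the Stan Math function.
// Maps an unconstrained vector y to y / ||y||, keeping the same size.
std::vector<double> unit_vector_constrain_sketch(const std::vector<double>& y) {
  if (y.empty()) {
    // A size-0 input would need a size-0 unit vector, which does not exist.
    throw std::invalid_argument("unit vector must have nonzero size");
  }
  double squared_norm = 0.0;
  for (double v : y) {
    squared_norm += v * v;
  }
  if (!(squared_norm > 0.0) || !std::isfinite(squared_norm)) {
    // The zero vector (or a non-finite input) cannot be normalized.
    throw std::domain_error("squared norm must be positive and finite");
  }
  const double norm = std::sqrt(squared_norm);
  std::vector<double> x(y.size());
  for (std::size_t i = 0; i < y.size(); ++i) {
    x[i] = y[i] / norm;  // output has exactly as many entries as the input
  }
  return x;
}
```

Under these assumptions, a size-2 input returns a size-2 result with norm 1, while an empty input is rejected rather than mapped to [1]'.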

Furthermore, the current implementation handles size-1 vectors correctly. Keep in mind that [1]' is not the only size-1 unit vector; there's also [-1]'. The example program compiles, but when you run it you get:

Exception: Found dimension size one in unit vector declaration. One-dimensional unit vector is discrete but the target distribution must be continuous. variable=alpha; dimension size expression=1 (in 'example.stan', line 2, column 14 to column 15)
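The remark about [1]' and [-1]' can be illustrated with the size-1 case of the same normalization: y / |y| is either +1 or -1 depending on the sign of y, so the constrained space is two isolated points. The sketch below is a hypothetical standalone helper (unit_vector_1d_sketch is not a Stan Math function) written only to show that behavior.

```cpp
#include <cmath>
#include <stdexcept>

// Hypothetical helper for illustration; not part of Stan Math.
// Size-1 case of y / ||y||: the result is the sign of y.
double unit_vector_1d_sketch(double y) {
  if (y == 0.0 || !std::isfinite(y)) {
    // Zero and non-finite inputs have no well-defined direction.
    throw std::domain_error("input must be nonzero and finite");
  }
  return y / std::fabs(y);  // +1.0 for positive y, -1.0 for negative y
}
```

Any positive input lands on 1 and any negative input lands on -1, which is why a one-dimensional unit vector is effectively a discrete quantity.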

bob-carpenter commented 3 years ago

Thanks for the correction, @nhuurre. You're absolutely right.

It's too bad we have to waste a parameter on it, but I suppose we have to for uniformity. It'll just be a single normally distributed parameter that always maps to [1]'. There's very little overhead in an additional standard normal component of the density.