stan-dev / math

The Stan Math Library is a C++ template library for automatic differentiation of any order using forward, reverse, and mixed modes. It includes a range of built-in functions for probabilistic modeling, linear algebra, and equation solving.
https://mc-stan.org
BSD 3-Clause "New" or "Revised" License
751 stars 188 forks source link

Vectorize everything with broadcasting #202

Closed bob-carpenter closed 8 years ago

bob-carpenter commented 8 years ago

Vectorize all functions for matching shapes and allow broadcasting of lower-dimensional types.

This will require

There is an example for the function inv() in stan/math/prim/mat/fun/inv.hpp in the branch feature/issue-202-vectorize-all.

This issue involves the first step above for all of the Stan functions, so it may need to be broken down into stages, starting with unary functions.

For the stan-dev/stan issue: https://github.com/stan-dev/stan/issues/1683

For each of thee functions, we need (1) functor, (2) general template definition, (3) doc in manual, (4) signatures [need function for this], (5) signature tests for all instantiations.

When I did the example for inv(), I had to remove the templating for inv() itself and replace it with double to remove the ambiguity. For functions that only have a template version (i.e., there is not also a version for fwd and rev), I believe there will need to be an enable_if on the base definition that restricts application to the base class.

We'll also need a testing framework. This should be easy to do following the way apply_scalar_unary is defined (across prim, rev, and fwd). It will require a simple templated testing functor for each function being tested.

(int):int

(int,int):int

(int,real):real

(real):int

(real):real

(real,real):int

(real,real):real

(int, real):real

(real,real,real):real

(int,real,real):real

rayleigh commented 8 years ago

I'm reopening this issue because the pull request that was merged in only provides the infrastructure to test these vectorize functions. However, if you think that we should close this issue and open a new one, I can do that.

bob-carpenter commented 8 years ago

Either way is OK. You need to make the checklist of all unary functions to which this applies. You can probably knock them off in multiple issues.

On Mar 15, 2016, at 4:52 PM, rayleigh notifications@github.com wrote:

I'm reopening this issue because the pull request that was merged in only provides the infrastructure to test these vectorize functions. However, if you think that we should close this issue and open a new one, I can do that.

— You are receiving this because you were assigned. Reply to this email directly or view it on GitHub

rayleigh commented 8 years ago

Okay. Copying over your response to a question of where to put files:

  1. The function definitions so should go here:
stan/math/prim/mat/fun/<function-name>.hpp
  1. For functions already partly vectorized, you need to remove the original vectorized definitions (just get rid of them and their associated tests).
  2. The function signatures class is going to need a utility method to declare these, presumably something like ``add_unary_vectorized("exp") and so on.

This should declare all the basic vectorizations and perhaps up to 4 deep for arrays of all types.

  1. You're going to need to write (or have me write) a. a new intro section to the functions guide in the manual, with some new syntax for this vectorization, and b. update the doc for each of the added functions

Also, where should the test files go? Should they go in /test/unit/math/mix/mat/fun/<function-name>_test.hpp?

bob-carpenter commented 8 years ago

Thanks and yes, that's the right place for the test files, because they depend on mix.

On Mar 17, 2016, at 10:30 AM, rayleigh notifications@github.com wrote:

Okay. Copying over your response to a question of where to put files:

• The function definitions so should go here: stan/math/prim/mat/fun/.hpp

• For functions already partly vectorized, you need to remove the original vectorized definitions (just get rid of them and their associated tests).

• The function signatures class is going to need a utility method to declare these, presumably something like ``add_unary_vectorized("exp") and so on.

This should declare all the basic vectorizations and perhaps up to 4 deep for arrays of all types.

• You're going to need to write (or have me write) a. a new intro section to the functions guide in the manual, with some new syntax for this vectorization, and b. update the doc for each of the added functions Also, where should the test files go? Should they go in /test/unit/math/mix/mat/fun/_test.hpp?

— You are receiving this because you were assigned. Reply to this email directly or view it on GitHub

rayleigh commented 8 years ago

Thanks. Looking through the 2.8.0 Stan manual, I noticed that abs is to be deprecated. Should I vectorize it or should I only vectorize fabs?

bob-carpenter commented 8 years ago

Ack, I think we undeprecated abs and are going to deprecate fabs (to make it more like math and less like C++). So go ahead and do both and we can see where the chips land.

On Mar 21, 2016, at 11:45 AM, rayleigh notifications@github.com wrote:

Thanks. Looking through the 2.8.0 Stan manual, I noticed that abs is to be deprecated. Should I vectorize it or should I only vectorize fabs?

— You are receiving this because you were assigned. Reply to this email directly or view it on GitHub

rayleigh commented 8 years ago

Sounds good; I vectorized both. I've been trying to vectorize the function cbrt. It's defined for 0, but its derivatives aren't (it returns NaN). From this, I realized that the testing framework assumed that if a function is defined for a value, its derivatives are as well. To handle this, does the vectorize function throw an error if an user enters 0? Or, do I write separate code to test this case?

syclik commented 8 years ago

I think we should test derivatives.

We had this discussion in the past regarding defined values and undefined derivatives. I think the functions should trap that error and fail early if possible, even if it conflicts with other implementations of the function. If we're using these functions primarily where they need derivatives, it makes no sense to knowingly propagate NaNs.

On Mon, Mar 21, 2016 at 7:00 PM, rayleigh notifications@github.com wrote:

Sounds good; I vectorized both. I've been trying to vectorize the function cbrt. It's defined for 0, but its derivatives aren't (it returns NaN). From this, I realized that the testing framework assumed that if a function is defined for a value, its derivatives are as well. To handle this, does the vectorize function throw an error if an user enters 0? Or, do I write separate code to test this case?

— You are receiving this because you modified the open/close state. Reply to this email directly or view it on GitHub https://github.com/stan-dev/math/issues/202#issuecomment-199525782

rayleigh commented 8 years ago

Okay. Just to clarify, should the vectorize-all function or the individual scalar functions include the error checking? Or, both?

syclik commented 8 years ago

That was a general comment.

What test am I supposed to run? Can you provide the full path?

On Tue, Mar 22, 2016 at 11:34 AM, rayleigh notifications@github.com wrote:

Okay. Just to clarify, should the vectorize-all function or the individual scalar functions include the error checking? Or, both?

— You are receiving this because you modified the open/close state. Reply to this email directly or view it on GitHub https://github.com/stan-dev/math/issues/202#issuecomment-199868918

rayleigh commented 8 years ago

I pushed more changes to the branch feature/issue-202-vectorize-all. However, I only wanted to commit things that worked. So, if you pull that branch, you'll need to change test/unit/math/mix/mat/fun/cbrt_test.cpp before running it. In the function valid_inputs, change any one of the values to 0.

rayleigh commented 8 years ago

I pushed the change to exp.hpp to feature/issue-202-vectorize-all. The test to run is test/unit/math/mix/mat/vectorize/foo_test.cpp.

bob-carpenter commented 8 years ago

Is it only exp() that's causing the issue?

On Mar 22, 2016, at 2:28 PM, rayleigh notifications@github.com wrote:

I pushed the change to exp.hpp to feature/issue-202-vectorize-all. The test to run is test/unit/math/mix/mat/vectorize/foo_test.cpp.

— You are receiving this because you were assigned. Reply to this email directly or view it on GitHub

syclik commented 8 years ago

No. This isn't an include-order issue.

Here's where I'm at so far with exp:

  1. There's a definition of a struct exp_fun in stan/math/prim/scal/mat/fun/exp.hpp.
  2. There is a definition of a templated function exp that should handle the vectorization in stan/math/prim/mat/fun/exp.hpp.
  3. There is a definition of a stan::math::var exp(const stan::math::var& a) in stan/math/rev/scal/fun/exp.hpp.
  4. When calling foo_test.cpp, which should really be changed to exp_test.cpp, with the appropriate changes, this is the compiler error message that crops up:
In file included from test/unit/math/mix/mat/vectorize/foo_test.cpp:1:
In file included from ./stan/math/mix/mat.hpp:12:
In file included from ./stan/math/prim/mat.hpp:91:
In file included from ./stan/math/prim/mat/fun/exp.hpp:4:
./stan/math/prim/mat/vectorize/apply_scalar_unary.hpp:40:41: error:
implicit instantiation of
      undefined template 'Eigen::internal::traits<stan::math::var>'
      typedef typename Eigen::internal::traits<T>::Scalar scalar_t;
                                        ^
./stan/math/prim/mat/fun/exp.hpp:23:21: note: in instantiation of template
class
      'stan::math::apply_scalar_unary<stan::math::exp_fun,
stan::math::var>' requested here
    inline typename apply_scalar_unary<exp_fun, T>::return_t
                    ^
./stan/math/rev/scal/fun/grad_inc_beta.hpp:28:16: note: while substituting
deduced template arguments
      into function template 'exp' [with T = stan::math::var]
      var c3 = exp(lbeta(a, b)) * inc_beta(a, b, z);
               ^
lib/eigen_3.2.4/Eigen/src/Core/util/ForwardDeclarations.h:17:29: note:
template is declared here
template<typename T> struct traits;
                            ^
In file included from test/unit/math/mix/mat/vectorize/foo_test.cpp:1:
In file included from ./stan/math/mix/mat.hpp:12:
In file included from ./stan/math/prim/mat.hpp:91:
In file included from ./stan/math/prim/mat/fun/exp.hpp:4:
  1. That compiler error is coming from the template specialization of apply_scalar_unary() not being used from stan/math/rev/mat/vectorize/apply_scalar_unary.hpp.

I think if I can figure out why that specialization is not being used, it would go a long way.

Daniel

On Tue, Mar 22, 2016 at 11:09 PM, Bob Carpenter notifications@github.com wrote:

Is it only exp() that's causing the issue?

  • Bob

On Mar 22, 2016, at 2:28 PM, rayleigh notifications@github.com wrote:

I pushed the change to exp.hpp to feature/issue-202-vectorize-all. The test to run is test/unit/math/mix/mat/vectorize/foo_test.cpp.

— You are receiving this because you were assigned. Reply to this email directly or view it on GitHub

— You are receiving this because you modified the open/close state. Reply to this email directly or view it on GitHub https://github.com/stan-dev/math/issues/202#issuecomment-200146513

syclik commented 8 years ago

Ok. Those headers aren't included in stan/math/prim/mat.hpp or rev/mat.hpp or fwd/mat.hpp.

I'll fix that now.

On Tue, Mar 22, 2016 at 11:22 PM, Daniel Lee bearlee@alum.mit.edu wrote:

No. This isn't an include-order issue.

Here's where I'm at so far with exp:

  1. There's a definition of a struct exp_fun in stan/math/prim/scal/mat/fun/exp.hpp.
  2. There is a definition of a templated function exp that should handle the vectorization in stan/math/prim/mat/fun/exp.hpp.
  3. There is a definition of a stan::math::var exp(const stan::math::var& a) in stan/math/rev/scal/fun/exp.hpp.
  4. When calling foo_test.cpp, which should really be changed to exp_test.cpp, with the appropriate changes, this is the compiler error message that crops up:
In file included from test/unit/math/mix/mat/vectorize/foo_test.cpp:1:
In file included from ./stan/math/mix/mat.hpp:12:
In file included from ./stan/math/prim/mat.hpp:91:
In file included from ./stan/math/prim/mat/fun/exp.hpp:4:
./stan/math/prim/mat/vectorize/apply_scalar_unary.hpp:40:41: error:
implicit instantiation of
      undefined template 'Eigen::internal::traits<stan::math::var>'
      typedef typename Eigen::internal::traits<T>::Scalar scalar_t;
                                        ^
./stan/math/prim/mat/fun/exp.hpp:23:21: note: in instantiation of template
class
      'stan::math::apply_scalar_unary<stan::math::exp_fun,
stan::math::var>' requested here
    inline typename apply_scalar_unary<exp_fun, T>::return_t
                    ^
./stan/math/rev/scal/fun/grad_inc_beta.hpp:28:16: note: while substituting
deduced template arguments
      into function template 'exp' [with T = stan::math::var]
      var c3 = exp(lbeta(a, b)) * inc_beta(a, b, z);
               ^
lib/eigen_3.2.4/Eigen/src/Core/util/ForwardDeclarations.h:17:29: note:
template is declared here
template<typename T> struct traits;
                            ^
In file included from test/unit/math/mix/mat/vectorize/foo_test.cpp:1:
In file included from ./stan/math/mix/mat.hpp:12:
In file included from ./stan/math/prim/mat.hpp:91:
In file included from ./stan/math/prim/mat/fun/exp.hpp:4:
  1. That compiler error is coming from the template specialization of apply_scalar_unary() not being used from stan/math/rev/mat/vectorize/apply_scalar_unary.hpp.

I think if I can figure out why that specialization is not being used, it would go a long way.

Daniel

On Tue, Mar 22, 2016 at 11:09 PM, Bob Carpenter notifications@github.com wrote:

Is it only exp() that's causing the issue?

  • Bob

On Mar 22, 2016, at 2:28 PM, rayleigh notifications@github.com wrote:

I pushed the change to exp.hpp to feature/issue-202-vectorize-all. The test to run is test/unit/math/mix/mat/vectorize/foo_test.cpp.

— You are receiving this because you were assigned. Reply to this email directly or view it on GitHub

— You are receiving this because you modified the open/close state. Reply to this email directly or view it on GitHub https://github.com/stan-dev/math/issues/202#issuecomment-200146513

syclik commented 8 years ago

I just pushed, but it's not working yet. Bob, I might need your help reasoning through this.

apply_scalar_unary() in prim is being called first. Even when the mix header is included. That gets instantiated with stan::math::var as its second argument instead of being found with argument dependent lookup. If you want more detail as to the include order, try running the test (it's ok if it doesn't compile), then open up test/unit/math/mix/vectorize/foo_test.d. That'll show you the order in which files get opened.

If you include the rev version as the first thing in the test file, it works.

I'm not sure how to get past this problem. I'm sure we've figured this out in the past.

On Tue, Mar 22, 2016 at 11:24 PM, Daniel Lee bearlee@alum.mit.edu wrote:

Ok. Those headers aren't included in stan/math/prim/mat.hpp or rev/mat.hpp or fwd/mat.hpp.

I'll fix that now.

On Tue, Mar 22, 2016 at 11:22 PM, Daniel Lee bearlee@alum.mit.edu wrote:

No. This isn't an include-order issue.

Here's where I'm at so far with exp:

  1. There's a definition of a struct exp_fun in stan/math/prim/scal/mat/fun/exp.hpp.
  2. There is a definition of a templated function exp that should handle the vectorization in stan/math/prim/mat/fun/exp.hpp.
  3. There is a definition of a stan::math::var exp(const stan::math::var& a) in stan/math/rev/scal/fun/exp.hpp.
  4. When calling foo_test.cpp, which should really be changed to exp_test.cpp, with the appropriate changes, this is the compiler error message that crops up:
In file included from test/unit/math/mix/mat/vectorize/foo_test.cpp:1:
In file included from ./stan/math/mix/mat.hpp:12:
In file included from ./stan/math/prim/mat.hpp:91:
In file included from ./stan/math/prim/mat/fun/exp.hpp:4:
./stan/math/prim/mat/vectorize/apply_scalar_unary.hpp:40:41: error:
implicit instantiation of
      undefined template 'Eigen::internal::traits<stan::math::var>'
      typedef typename Eigen::internal::traits<T>::Scalar scalar_t;
                                        ^
./stan/math/prim/mat/fun/exp.hpp:23:21: note: in instantiation of
template class
      'stan::math::apply_scalar_unary<stan::math::exp_fun,
stan::math::var>' requested here
    inline typename apply_scalar_unary<exp_fun, T>::return_t
                    ^
./stan/math/rev/scal/fun/grad_inc_beta.hpp:28:16: note: while
substituting deduced template arguments
      into function template 'exp' [with T = stan::math::var]
      var c3 = exp(lbeta(a, b)) * inc_beta(a, b, z);
               ^
lib/eigen_3.2.4/Eigen/src/Core/util/ForwardDeclarations.h:17:29: note:
template is declared here
template<typename T> struct traits;
                            ^
In file included from test/unit/math/mix/mat/vectorize/foo_test.cpp:1:
In file included from ./stan/math/mix/mat.hpp:12:
In file included from ./stan/math/prim/mat.hpp:91:
In file included from ./stan/math/prim/mat/fun/exp.hpp:4:
  1. That compiler error is coming from the template specialization of apply_scalar_unary() not being used from stan/math/rev/mat/vectorize/apply_scalar_unary.hpp.

I think if I can figure out why that specialization is not being used, it would go a long way.

Daniel

On Tue, Mar 22, 2016 at 11:09 PM, Bob Carpenter <notifications@github.com

wrote:

Is it only exp() that's causing the issue?

  • Bob

On Mar 22, 2016, at 2:28 PM, rayleigh notifications@github.com wrote:

I pushed the change to exp.hpp to feature/issue-202-vectorize-all. The test to run is test/unit/math/mix/mat/vectorize/foo_test.cpp.

— You are receiving this because you were assigned. Reply to this email directly or view it on GitHub

— You are receiving this because you modified the open/close state. Reply to this email directly or view it on GitHub https://github.com/stan-dev/math/issues/202#issuecomment-200146513

rayleigh commented 8 years ago

Any luck getting past this problem?

rayleigh commented 8 years ago

Copying this over from an email chain because it might have the solution (I haven't tried yet):

If you edit out the very first include line in:

/test/unit/math/mix/mat/vectorize/foo_test.cpp,

everything passes. So the new file looks like:

---------------------
// #include <stan/math/mix/mat.hpp>
#include <gtest/gtest.h>
#include <test/unit/math/prim/mat/vectorize/prim_scalar_unary_test.hpp>
#include <test/unit/math/rev/mat/vectorize/rev_scalar_unary_test.hpp>
#include <test/unit/math/fwd/mat/vectorize/fwd_scalar_unary_test.hpp>
#include <test/unit/math/mix/mat/vectorize/mix_scalar_unary_test.hpp>
#include <test/unit/math/prim/mat/vectorize/foo_fun.hpp>
#include <test/unit/math/prim/mat/vectorize/vector_builder.hpp>

/**
 * This is the structure for testing mock function foo (defined in the
 * testing framework).  See README.txt for more instructions.
 */
struct foo_test {
-------------------------------

This makes me think there may be a problem in the include order in mix/mat.hpp.

You might want to read this:

http://stackoverflow.com/questions/14626033/implicit-instantiation-of-undefined-template-class

It's all about steps 1, 2, and 3 in the answer.

rayleigh commented 8 years ago

Other questions:

1) I tried vectorizing trunc.hpp, but the matrix tests fail to run for it. When I looked at boost/math/special_functions/trunc.hpp, I saw the following comment:

// The following functions will not compile unless T has an
// implicit convertion to the integer types. 

So, is trunc.hpp designed to work on matrix?

2) According to the manual, the function step returns a real, but when I look at the prim/scal/fun/step.hpp, it returns an int. Should I change the function to return a real?

3) For cbrt, I know that I asked about it above, but I want to clarify what I should do about it. In the tests, should I allow the code to pass if both the expected and tested derivatives are NaN?

rayleigh commented 8 years ago

Listing why certain unary functions weren't vectorized: Unvectorized acosh and atanh because error handling isn't the same across all modes. Asinh looks to have similar coding problems as acosh and atanh, but it's defined for all values so I'm leaving it vectorized.

Unvectorized exp2, log2, and log1p because of an ambiguity error when making Stan model's C++ code for untemplated prim versions.

trunc and step aren't vectorized because both aren't compatible with matrices.

Other (real):real functions aren't vectorized because they're missing a mode and their prim version can't be untemplated.

bob-carpenter commented 8 years ago

Let's just get what you have working in and then create new issues for these other cases. The other four cases with solutions are as follows:

  1. acosh etc (varying failure conditions in prim/rev/fwd) generalize testing or change error handling.
  2. exp2 etc (defined only in ::) just need to resolve instantiation ambiguity as we've done for other cases, perhaps with enable_if, perhaps with appropriate qualification in generated code (see existing case of max() in generator).
  3. step etc. (integer returns) just need appropriate enable_if to remove matrix arguments; or they can be defined and just not included in function_signatures in the Stan language; there's no API problem with allowing Matrix<int,...>.
  4. others (?) (missing fwd or rev, so need templated prim case) easy fix to define prim case for double and then directly define fwd and rev without templates (well, fwd needs a template, but the whole arg's not templated) a more general fix requiring less code dup would be to disable_if the templated prim case to only apply to scalars (which is easy in theory, but may be tricky in practice to deal with our include orders among prim/fwd/rev and scal/arr/mat.
    • Bob

On Apr 11, 2016, at 2:50 PM, rayleigh notifications@github.com wrote:

Listing why certain unary functions weren't vectorized: Unvectorized acosh and atanh because error handling isn't the same across all modes. Asinh looks to have similar coding problems as acosh and atanh, but it's defined for all values so I'm leaving it vectorized.

Unvectorized exp2, log2, and log1p because of an ambiguity error when making Stan model's C++ code for untemplated prim versions.

trunc and step aren't vectorized because both aren't compatible with matrices.

Other (real):real functions aren't vectorized because they're missing a mode and their prim version can't be untemplated.

— You are receiving this because you were assigned. Reply to this email directly or view it on GitHub

rayleigh commented 8 years ago

Sounds good though I disagree about the solution to 1: 1) In the rev version of acosh.hpp, you have:

namespace {
      class acosh_vari : public op_v_vari {
      public:
        acosh_vari(double val, vari* avi) :
          op_v_vari(val, avi) {
        }
        void chain() {
          avi_->adj_ += adj_ / std::sqrt(avi_->val_ * avi_->val_ - 1.0);
        }
      };
    }

In the rev version of digamma.hpp (which also uses code from boost/math/special_functions), you have

namespace {
      class digamma_vari : public op_v_vari {
      public:
        explicit digamma_vari(vari* avi) :
          op_v_vari(digamma(avi->val_), avi) {
        }
        void chain() {
          avi_->adj_ += adj_ * trigamma(avi_->val_);
        }
      };
    }

The vectorized version of digamma handles error consistently across all versions so I think the rev and fwd mode code needs to be fixed for acosh, atanh, and asinh.

bob-carpenter commented 8 years ago

Here's the actual function for acosh:

inline var acosh(const var& a) { if (boost::math::isinf(a.val()) && a > 0.0) return var(new acoshvari(a.val(), a.vi)); return var(new acoshvari(::acosh(a.val()), a.vi)); }

We could move the call to ::acosh() into acosh_vari constructor, but that wouldn't change the error handling. Or is it the conditioning on isinf that's the issue?

So I'm not sure what you're suggesting or where there is currently an inconsistency.

rayleigh commented 8 years ago

Thanks for pointing that out. I took another look and I think the includes might be the issue because for acosh.hpp, the includes are:

#include <math.h>
#include <stan/math/rev/core.hpp>
#include <boost/math/special_functions/fpclassify.hpp>
#include <cmath>

#ifdef _MSC_VER
#include <boost/math/special_functions/acosh.hpp>
using boost::math::acosh;
#endif

So, unless it's being compiled on a Visual C++ compiler, I don't think it'll use acosh.hpp from boost/math/special_functions, which is providing the error handling. Because Stan doesn't use C++11 and acosh.hpp is only available in C++11's cmath library, the vectorized version of acosh.hpp uses acosh.hpp from boost/math/special_functions as the base function. I think whether boost/math/special_functions/acosh.hpp is included explains the inconsistency in error handling that I'm seeing.

bob-carpenter commented 8 years ago

That conditional include is nasty. Maybe we should just use boost::math::acosh independently of platform.

Is there ever a reason to include both and ? And doesn't that include have to go last?

The error handling can be fixed by just adding it to the code, even if we do use Boost's definition for computing non-error inputs.

On Apr 11, 2016, at 5:40 PM, rayleigh notifications@github.com wrote:

Thanks for pointing that out. I took another look and I think the includes might be the issue because for acosh.hpp, the includes are:

include

include <stan/math/rev/core.hpp>

include <boost/math/special_functions/fpclassify.hpp>

include

ifdef _MSC_VER

include <boost/math/special_functions/acosh.hpp>

using boost::math::acosh;

endif

So, unless it's being compiled on a Visual C++ compiler, I don't think it'll use acosh.hpp from boost/math/special_functions, which is providing the error handling. Because Stan doesn't use C++11 and acosh.hpp is only available in C++11's cmath library, the vectorized version of acosh.hpp uses acosh.hpp from boost/math/special_functions as the base function. I think whether boost/math/special_functions/acosh.hpp is included explains the inconsistency in error handling that I'm seeing.

— You are receiving this because you were assigned. Reply to this email directly or view it on GitHub

bob-carpenter commented 8 years ago

@rayleigh Is the current checklist above up to date? Is there a reason the dozen or so (real): real signature ones aren't implemented?

bob-carpenter commented 8 years ago

Closing in favor of newer issue #347 given that it's partially done.

jessexknight commented 1 year ago

Can we reopen this? The logical operators (at least) seem to remain un-vectorized

Binary infix operator == with functional interpretation logical_eq requires arguments of primitive type (int or real), found left type=int[ ], right arg type=int.

Thanks,

bob-carpenter commented 1 year ago

@jessexknight Probably better to start a new issue specifically or logical operations, because it's not immediately obvious how they should behave. This issue was specifically for real-valued operations and used a specific vectorization assuming real valued functions with autodiff---here there are no derivatives because we get integer values out.

I can imagine two alternatives for the logical operators:

  1. int logical_eq(reals x, reals y); which returns a single truth value conjoining the element wise result.

  2. int[] logical_eq(reals x, reals y); which returns a container of results element wise.

Both approaches can broadcast scalars to containers. If we go with (2), then we probably want functions all() and any() like in NumPy that return the conjunction and disjunction of a container.

This is related to the issue of comparing containers to elements to return booleans, the way that R allows. That'd give us something like this

int[] xs = { 1, 0, 2, 3, 0, -1};
int[] ys = (xs == 0);   // after eval, ys = { 2, 5 }
jessexknight commented 1 year ago

Thanks, I've evidently added an issue...

Just want to add that I would strongly prefer (2) to allow more flexibility.

Not sure I follow the last issue you mentioned -- I'm not familiar with this behaviour in R, and personally wouldn't find this a priority vs the vectorization of the base logical functions.

bob-carpenter commented 1 year ago

Thanks for opening the issue. I edited it slightly to make it easier for our devs.

In R, you can do this:

> sex = c(0, 1, 1, 0, 0, 0, 1)
> age = c(23, 29, 30, 12, 15, 18)
> age[sex == 0]
[1] 23 12 15 18
> age[sex == 1]
[1] 29 30 NA
> age = c(23, 29, 30, 22, 25, 18, 31)
> sex = c(0,   1,  1 , 0,  0 , 0,  1)

> sex == 0
[1]  TRUE FALSE FALSE  TRUE  TRUE  TRUE FALSE

> age[sex == 0]
[1] 23 22 25 18

As you can see, it's useful for picking subgroups out of parallel sequences. But it relies on really odd behavior where if you give R a list of boolean arguments, it'll include the ones that are TRUE in the result. This doesn't make much sense, as TRUE evaluates to 1 and FALSE evaluates to 0, but you get very different results if you replace the booleans with integers here.

> age[c(1, 0, 0, 1, 1, 1, 0)]
[1] 23 23 23 23

This is a very R result in that it seems to just ignore the out of range 0 inputs!

jessexknight commented 1 year ago

Oh, I see. I think this logical indexing is available in Numpy and Matlab too. In fact, I think logical indexing can be faster in Numpy, besides the fact that 0 is not out of bounds ;)

I suppose this raises the idea of a logical data type in Stan, but I think this type of indexing might be one of the only use cases ...

bob-carpenter commented 1 year ago

logical indexing can be faster in Numpy

It's always going to be bound by having to evaluate the condition for every element of the container. You could potentially do it without constructing the intermediate sex == 0 array by evaluating it lazily with an expression template, which would be more efficient.

I suppose this raises the idea of a logical data type in Stan

We've already committed to the C-style coding of 0 for false and everything else being true, so if we did go down the boolean route, we'd at least need the type to be promotable to int.

andrjohns commented 1 year ago

As a bit of background for this: