test nan input for error handling in stan/math/error_handling

stan-dev / stan

Stan development repository. The master branch contains the current release. The develop branch contains the latest stable development. See the Developer Process Wiki for details.

https://mc-stan.org

BSD 3-Clause "New" or "Revised" License

2.59k stars 369 forks source link

test nan input for error handling in stan/math/error_handling #891

Closed PeterLi2016 closed 10 years ago

PeterLi2016 commented 10 years ago

Add tests for nan input.

[x] #909: check_bounded.hpp
[x] #910: check_consistent_size.hpp
[x] #911: check_consistent_sizes.hpp
[x] #912: check_equal.hpp
[x] #913: check_finite.hpp
[x] #917: check_greater.hpp
[x] #914: check_greater_or_equal.hpp
[x] #915: check_less.hpp
[x] #916: check_less_or_equal.hpp
[x] #918: check_nonnegative.hpp
[x] check_not_nan.hpp: no new tests needed
[x] #919: check_positive.hpp
[x] #932: check_positive_finite.hpp
[x] #929: check_cholesky_factor.hpp
[x] #935: check_cholesky_factor_corr.hpp: depends on check_unit_vec
[x] #920: check_column_index.hpp
[x] #906: check_corr_matrix.hpp: fix fabs(y(k,k) - 1.0) > CONSTRAINT_TOLERANCE and depends on check_pos_def
[x] #936: check_cov_matrix.hpp: depends on check_pos_def and check_sym
[x] #930: check_ldlt_factor.hpp: only fails if nan is in lower triangular portion of matrix
[x] #928: check_lower_triangular.hpp
[x] #921: check_matching_dims.hpp
[x] #922: check_matching_sizes.hpp
[x] #923: check_multiplicable.hpp
[x] #924: check_nonzero_size.hpp
[x] #933: check_ordered.hpp
[x] #901: check_pos_definite.hpp
[x] #908: check_pos_semidefinite.hpp
[x] #931: check_positive_ordered.hpp
[x] check_range.hpp: no test needs to be added since it takes args size_t
[x] #925: check_row_index.hpp
[x] #897: check_simplex.hpp
[x] check_size_match.hpp: no test needs to be added since it takes args size_t
[x] #937: check_spsd_matrix.hpp: depends on check_sym and check_pos_semidef
[x] #934: check_square.hpp
[x] #926: check_std_vector_index.hpp
[x] #905: check_symmetric.hpp: fix fabs(y(m,n) - y(n,m)) > CONSTRAINT_TOLERANCE but should this return true if the matrix is 1x1 with element (0,0) being nan? what if it's a 2x2 and symmetric in that (0,1)=(1,0)=nan?
[x] #899: check_unit_vector.hpp
[x] #927: check_vector.hpp
[x] constraint_tolerance.hpp: no test needed
[x] validate_non_negative_index.hpp: no test needs to be added since it takes arg int

After all of the above pull requests have been merged, check the branch https://github.com/stan-dev/stan/tree/feature/issue-891-error-checks-nan and make sure everything here was merged in -- the doc was partially redone since a lot of it was done all in one commit.

[x] delete this branch when done

PeterLi2016 commented 10 years ago

If nan is passed into a check, should the check fail? If so, something like this

    template <typename T_y, typename T_result>
    bool check_ordered(const char* function,
                       const Eigen::Matrix<T_y,Eigen::Dynamic,1>& y,
                       const char* name,
                       T_result* result) {
      typedef typename Eigen::Matrix<T_y,Eigen::Dynamic,1>::size_type size_t;
      if (y.size() == 0) {
        return true;
      }
      for (size_t n = 1; n < y.size(); n++) {
        if (!(y[n] > y[n-1])) {
          std::ostringstream stream;
          stream << " is not a valid ordered vector."
                 << " The element at " << stan::error_index::value + n 
                 << " is %1%, but should be greater than the previous element, "
                 << y[n-1];
          std::string msg(stream.str());
          return dom_err(function,y[n],name,
                         msg.c_str(),"",
                         result);
          return false;
        }
      }
      return true;
    }

will work since the greater than operator returns false if either argument is nan.

However, something like

  template <typename T_y, typename T_result>
    inline bool check_symmetric(const char* function,
                                const Eigen::Matrix<T_y,Eigen::Dynamic,Eigen::Dynamic>& y,
                                const char* name,
                                T_result* result) {
      typedef 
        typename Eigen::Matrix<T_y,Eigen::Dynamic,Eigen::Dynamic>::size_type 
        size_type;
      size_type k = y.rows();
      if (k == 1)
        return true;
      for (size_type m = 0; m < k; ++m) {
        for (size_type n = m + 1; n < k; ++n) {
          if (fabs(y(m,n) - y(n,m)) > CONSTRAINT_TOLERANCE) {
            std::ostringstream message;
            message << name << " is not symmetric. " 
                    << name << "[" << stan::error_index::value + m << "," 
                    << stan::error_index::value +n << "] is %1%, but "
                    << name << "[" << stan::error_index::value +n << "," 
                    << stan::error_index::value + m 
                    << "] element is " << y(n,m);
            std::string msg(message.str());
            return dom_err(function,y(m,n),name,
                           msg.c_str(),"",
                           result);
          }
        }
      }
      return true;
    }

will not work, again because the greater than operator returns false if either argument is nan.

syclik commented 10 years ago

If I remember correctly, someone consciously removed the nan checks from the rest of the checks. So we'll have to check nan separately from everything else, I think.

On Wed, Aug 20, 2014 at 7:32 PM, Peter Li notifications@github.com wrote:

If nan is passed into a check, should the check fail? If so, something like this
template <typename T_y, typename T_result>
bool check_ordered(const char* function,
                   const Eigen::Matrix<T_y,Eigen::Dynamic,1>& y,
                   const char* name,
                   T_result* result) {
  typedef typename Eigen::Matrix<T_y,Eigen::Dynamic,1>::size_type size_t;
  if (y.size() == 0) {
    return true;
  }
  for (size_t n = 1; n < y.size(); n++) {
    if (!(y[n] > y[n-1])) {
      std::ostringstream stream;
      stream << " is not a valid ordered vector."
             << " The element at " << stan::error_index::value + n
             << " is %1%, but should be greater than the previous element, "
             << y[n-1];
      std::string msg(stream.str());
      return dom_err(function,y[n],name,
                     msg.c_str(),"",
                     result);
      return false;
    }
  }
  return true;
}
will work since the greater than operator returns false if either argument is nan.

However, something like

template <typename T_y, typename T_result> inline bool check_symmetric(const char* function, const Eigen::Matrix<T_y,Eigen::Dynamic,Eigen::Dynamic>& y, const char* name, T_result* result) { typedef typename Eigen::Matrix<T_y,Eigen::Dynamic,Eigen::Dynamic>::size_type size_type; size_type k = y.rows(); if (k == 1) return true; for (size_type m = 0; m < k; ++m) { for (size_type n = m + 1; n < k; ++n) { if (fabs(y(m,n) - y(n,m)) > CONSTRAINT_TOLERANCE) { std::ostringstream message; message << name << " is not symmetric. " << name << "[" << stan::error_index::value + m << "," << stan::error_index::value +n << "] is %1%, but " << name << "[" << stan::error_index::value +n << "," << stan::error_index::value + m << "] element is " << y(n,m); std::string msg(message.str()); return dom_err(function,y(m,n),name, msg.c_str(),"", result); } } } return true; }

will not work, again because the greater than operator returns false if either argument is nan.

— Reply to this email directly or view it on GitHub https://github.com/stan-dev/stan/issues/891#issuecomment-52859901.

PeterLi2016 commented 10 years ago

So should all checks have check_nan() first? Some of the checks look like they'd still work with nans.

On Wed, Aug 20, 2014 at 9:29 PM, Daniel Lee notifications@github.com wrote:

If I remember correctly, someone consciously removed the nan checks from the rest of the checks. So we'll have to check nan separately from everything else, I think.

On Wed, Aug 20, 2014 at 7:32 PM, Peter Li notifications@github.com wrote:

If nan is passed into a check, should the check fail? If so, something like this

template <typename T_y, typename T_result> bool check_ordered(const char* function, const Eigen::Matrix<T_y,Eigen::Dynamic,1>& y, const char* name, T_result* result) { typedef typename Eigen::Matrix<T_y,Eigen::Dynamic,1>::size_type size_t; if (y.size() == 0) { return true; } for (size_t n = 1; n < y.size(); n++) { if (!(y[n] > y[n-1])) { std::ostringstream stream; stream << " is not a valid ordered vector." << " The element at " << stan::error_index::value + n << " is %1%, but should be greater than the previous element, " << y[n-1]; std::string msg(stream.str()); return dom_err(function,y[n],name, msg.c_str(),"", result); return false; } } return true; }

will work since the greater than operator returns false if either argument is nan.

However, something like

template <typename T_y, typename T_result> inline bool check_symmetric(const char* function, const Eigen::Matrix<T_y,Eigen::Dynamic,Eigen::Dynamic>& y, const char* name, T_result* result) { typedef typename Eigen::Matrix<T_y,Eigen::Dynamic,Eigen::Dynamic>::size_type size_type; size_type k = y.rows(); if (k == 1) return true; for (size_type m = 0; m < k; ++m) { for (size_type n = m + 1; n < k; ++n) { if (fabs(y(m,n) - y(n,m)) > CONSTRAINT_TOLERANCE) { std::ostringstream message; message << name << " is not symmetric. " << name << "[" << stan::error_index::value + m << "," << stan::error_index::value +n << "] is %1%, but " << name << "[" << stan::error_index::value +n << "," << stan::error_index::value + m << "] element is " << y(n,m); std::string msg(message.str()); return dom_err(function,y(m,n),name, msg.c_str(),"", result); } } } return true; }

will not work, again because the greater than operator returns false if either argument is nan.

— Reply to this email directly or view it on GitHub https://github.com/stan-dev/stan/issues/891#issuecomment-52859901.

— Reply to this email directly or view it on GitHub https://github.com/stan-dev/stan/issues/891#issuecomment-52868189.

syclik commented 10 years ago

My guess is that we haven't been consistent and should be.

On Wed, Aug 20, 2014 at 9:32 PM, Peter Li notifications@github.com wrote:

So should all checks have check_nan() first? Some of the checks look like they'd still work with nans.

On Wed, Aug 20, 2014 at 9:29 PM, Daniel Lee notifications@github.com wrote:

If I remember correctly, someone consciously removed the nan checks from the rest of the checks. So we'll have to check nan separately from everything else, I think.

On Wed, Aug 20, 2014 at 7:32 PM, Peter Li notifications@github.com wrote:

If nan is passed into a check, should the check fail? If so, something like this

template <typename T_y, typename T_result> bool check_ordered(const char* function, const Eigen::Matrix<T_y,Eigen::Dynamic,1>& y, const char* name, T_result* result) { typedef typename Eigen::Matrix<T_y,Eigen::Dynamic,1>::size_type size_t; if (y.size() == 0) { return true; } for (size_t n = 1; n < y.size(); n++) { if (!(y[n] > y[n-1])) { std::ostringstream stream; stream << " is not a valid ordered vector." << " The element at " << stan::error_index::value + n << " is %1%, but should be greater than the previous element, " << y[n-1]; std::string msg(stream.str()); return dom_err(function,y[n],name, msg.c_str(),"", result); return false; } } return true; }

will work since the greater than operator returns false if either argument is nan.

However, something like

template <typename T_y, typename T_result> inline bool check_symmetric(const char* function, const Eigen::Matrix<T_y,Eigen::Dynamic,Eigen::Dynamic>& y, const char* name, T_result* result) { typedef typename Eigen::Matrix<T_y,Eigen::Dynamic,Eigen::Dynamic>::size_type size_type; size_type k = y.rows(); if (k == 1) return true; for (size_type m = 0; m < k; ++m) { for (size_type n = m + 1; n < k; ++n) { if (fabs(y(m,n) - y(n,m)) > CONSTRAINT_TOLERANCE) { std::ostringstream message; message << name << " is not symmetric. " << name << "[" << stan::error_index::value + m << "," << stan::error_index::value +n << "] is %1%, but " << name << "[" << stan::error_index::value +n << "," << stan::error_index::value + m << "] element is " << y(n,m); std::string msg(message.str()); return dom_err(function,y(m,n),name, msg.c_str(),"", result); } } } return true; }

will not work, again because the greater than operator returns false if either argument is nan.

— Reply to this email directly or view it on GitHub https://github.com/stan-dev/stan/issues/891#issuecomment-52859901.

— Reply to this email directly or view it on GitHub https://github.com/stan-dev/stan/issues/891#issuecomment-52868189.

— Reply to this email directly or view it on GitHub https://github.com/stan-dev/stan/issues/891#issuecomment-52868359.

bob-carpenter commented 10 years ago

On Aug 20, 2014, at 9:33 PM, Daniel Lee notifications@github.com wrote:

My guess is that we haven't been consistent and should be.

The important thing is that we be consistent across double and var behavior and document what that behavior is. They should have return values and throw the same exceptions.

If we want any of the standard lib C++ functions to throw exceptions, we'll need to override them. We decided we could move in that direction, but it wasn't critical.

Michael answered the question about NaN in derivatives --- if the value of the function is NaN, all partials should also be NaN.

syclik commented 10 years ago

Sorry, my comment was with respect to our check functions. I'm sure some of them check for nan while others don't. It'd be great if they were consistent, as in all of them do not check for nan explicitly except for check_not_nan, or all of them checks for nan where appropriate.

We're all agreed on the behavior of our functions. The most important thing is consistency between our stan::math functions and stan::agrad functions.

NaN for partials when the value of the function is NaN sounds good.

On Wed, Aug 20, 2014 at 10:11 PM, Bob Carpenter notifications@github.com wrote:

On Aug 20, 2014, at 9:33 PM, Daniel Lee notifications@github.com wrote:

My guess is that we haven't been consistent and should be.

The important thing is that we be consistent across double and var behavior and document what that behavior is. They should have return values and throw the same exceptions.

If we want any of the standard lib C++ functions to throw exceptions, we'll need to override them. We decided we could move in that direction, but it wasn't critical.

Michael answered the question about NaN in derivatives --- if the value of the function is NaN, all partials should also be NaN.

Bob

— Reply to this email directly or view it on GitHub https://github.com/stan-dev/stan/issues/891#issuecomment-52870514.

bob-carpenter commented 10 years ago

I think it'll depend on the test. Tests for container size shouldn't check for NaN contents. But anything checking for finiteness or greater than or less than a bounds, or positive-definiteness or whatever should be checking there aren't any NaN values.

On Aug 20, 2014, at 10:16 PM, Daniel Lee notifications@github.com wrote:

Sorry, my comment was with respect to our check functions. I'm sure some of them check for nan while others don't. It'd be great if they were consistent, as in all of them do not check for nan explicitly except for check_not_nan, or all of them checks for nan where appropriate.

We're all agreed on the behavior of our functions. The most important thing is consistency between our stan::math functions and stan::agrad functions.

NaN for partials when the value of the function is NaN sounds good.

On Wed, Aug 20, 2014 at 10:11 PM, Bob Carpenter notifications@github.com wrote:

On Aug 20, 2014, at 9:33 PM, Daniel Lee notifications@github.com wrote:

My guess is that we haven't been consistent and should be.

The important thing is that we be consistent across double and var behavior and document what that behavior is. They should have return values and throw the same exceptions.

If we want any of the standard lib C++ functions to throw exceptions, we'll need to override them. We decided we could move in that direction, but it wasn't critical.

Michael answered the question about NaN in derivatives --- if the value of the function is NaN, all partials should also be NaN.

Bob

— Reply to this email directly or view it on GitHub https://github.com/stan-dev/stan/issues/891#issuecomment-52870514.

— Reply to this email directly or view it on GitHub.

PeterLi2016 commented 10 years ago

Ok, so for check_positive_ordered and check_ordered, should these fail if there is a nan in the middle of the vector?

And then for check_symmetric, if it's a 1x1 matrix where the only element is nan, should this throw or not? And what if it's a 2x2 matrix with both off diagonal elements being nan?

bob-carpenter commented 10 years ago

[changed subject]

I think check_positive_ordered should definitely fail because NaN isn't positive.

I think check_ordered should fail, too, which makes me think our sort function should just fail with NaN inputs. I think they should match --- anything that comes out of sort without error should satisfy check_ordered.

As to check_symmetric, that could go either way. Wherever it's used, we probably want to also do a finiteness check, which would rule out NaN and infinite values.

On Aug 21, 2014, at 2:11 PM, Peter Li notifications@github.com wrote:

Ok, so for check_positive_ordered and check_ordered, should these fail if there is a nan in the middle of the vector?

And then for check_symmetric, if it's a 1x1 matrix where the only element is nan, should this throw or not? And what if it's a 2x2 matrix with both off diagonal elements being nan?

— Reply to this email directly or view it on GitHub.