suspicious results from neg_binomial_2_log_glm_lpmf adjoint

madeleineth commented 6 years ago

Summary:

neg_binomial_2_log_glm_lpmf has suspicious behavior in its partial derivative with respect to alpha.

Description:

I am seeing inconsistent results in the adjoint of alpha when computed with neg_binomial_2_log_lpmf and neg_binomial_2_log_glm_lpmf. The adjoint computed with neg_binomial_2_log_lpmf matches the numeric derivative (computed in unit tests of a different system), so I suspect an issue with neg_binomial_2_log_glm_lpmf. (I may also just be confused; this is my first time descending into the stan math code.)

Reproducible Steps:

$ clang++ --version
clang version 6.0.0-1ubuntu2 (tags/RELEASE_600/final)
Target: x86_64-pc-linux-gnu
Thread model: posix
InstalledDir: /usr/bin

$ lsb_release -a
No LSB modules are available.
Distributor ID: Ubuntu
Description:    Ubuntu 18.04 LTS
Release:        18.04
Codename:       bionic

$ make -j2 ./test/unit/math/rev/mat/prob/neg_binomial_2_log_glm_lpmf_test && 
    ./test/unit/math/rev/mat/prob/neg_binomial_2_log_glm_lpmf_test 
...
[  PASSED  ] 2 tests.

# Now comment out line 72 in neg_binomial_2_log_glm_lpmf_test.cpp, the
# call to `recover_memory` in glm_matches_neg_binomial_2_log_vars.

$ make -j2 ./test/unit/math/rev/mat/prob/neg_binomial_2_log_glm_lpmf_test && 
    ./test/unit/math/rev/mat/prob/neg_binomial_2_log_glm_lpmf_test 
[==========] Running 2 tests from 1 test case.
[----------] Global test environment set-up.
[----------] 2 tests from ProbDistributionsNegBinomial2LogGLM
[ RUN      ] ProbDistributionsNegBinomial2LogGLM.glm_matches_neg_binomial_2_log_doubles
[       OK ] ProbDistributionsNegBinomial2LogGLM.glm_matches_neg_binomial_2_log_doubles (1 ms)
[ RUN      ] ProbDistributionsNegBinomial2LogGLM.glm_matches_neg_binomial_2_log_vars
test/unit/math/rev/mat/prob/neg_binomial_2_log_glm_lpmf_test.cpp:90: Failure
Value of: beta2[i].adj()
  Actual: 61
Expected: beta[i].adj()
Which is: 305
test/unit/math/rev/mat/prob/neg_binomial_2_log_glm_lpmf_test.cpp:90: Failure
Value of: beta2[i].adj()
  Actual: -121.4
Expected: beta[i].adj()
Which is: -607
test/unit/math/rev/mat/prob/neg_binomial_2_log_glm_lpmf_test.cpp:92: Failure
Value of: alpha2.adj()
  Actual: -2
Expected: alpha.adj()
Which is: -9.6000004
...lots more...

# Or, leave the line as it was, but add `double alpha_adj = alpha.adj();`
# between `lp.grad()` and `recover_memory`, then add
# `EXPECT_FLOAT_EQ(alpha_adj, alpha2.adj());` anywhere after `lp2.grad()`.

$ make -j2 ./test/unit/math/rev/mat/prob/neg_binomial_2_log_glm_lpmf_test && 
    ./test/unit/math/rev/mat/prob/neg_binomial_2_log_glm_lpmf_test 
[==========] Running 2 tests from 1 test case.
[----------] Global test environment set-up.
[----------] 2 tests from ProbDistributionsNegBinomial2LogGLM
[ RUN      ] ProbDistributionsNegBinomial2LogGLM.glm_matches_neg_binomial_2_log_doubles
[       OK ] ProbDistributionsNegBinomial2LogGLM.glm_matches_neg_binomial_2_log_doubles (0 ms)
[ RUN      ] ProbDistributionsNegBinomial2LogGLM.glm_matches_neg_binomial_2_log_vars
test/unit/math/rev/mat/prob/neg_binomial_2_log_glm_lpmf_test.cpp:93: Failure
Value of: alpha2.adj()
  Actual: -2
Expected: alpha_adj
Which is: -3.2
[  FAILED  ] ProbDistributionsNegBinomial2LogGLM.glm_matches_neg_binomial_2_log_vars (0 ms)
[----------] 2 tests from ProbDistributionsNegBinomial2LogGLM (0 ms total)

[----------] Global test environment tear-down
[==========] 2 tests from 1 test case ran. (1 ms total)
[  PASSED  ] 1 test.
[  FAILED  ] 1 test, listed below:
[  FAILED  ] ProbDistributionsNegBinomial2LogGLM.glm_matches_neg_binomial_2_log_vars

I've repeated the alpha_adj check on macOS 10.13.4 (Apple LLVM version 9.1.0 (clang-902.0.39.1)) with the same results.

Additional Information:

My guess is that there is a bug in the neg_binomial_2_log_glm_lpmf gradient code. The glm_matches_neg_binomial_2_log_vars test doesn't find it because it calls recover_memory, then references alpha, etc. (which are invalidated by recover_memory) afterwards in such a way that the adjoints from the neg_binomial_2_log_glm_lpmf are getting compared to themselves, so all the tests pass. Indeed, this patch:

--- a/test/unit/math/rev/mat/prob/neg_binomial_2_log_glm_lpmf_test.cpp
+++ b/test/unit/math/rev/mat/prob/neg_binomial_2_log_glm_lpmf_test.cpp
@@ -84,6 +84,8 @@ TEST(ProbDistributionsNegBinomial2LogGLM, glm_matches_neg_binomial_2_log_vars) {
   var lp2
       = stan::math::neg_binomial_2_log_glm_lpmf(n2, x2, beta2, alpha2, phi2);
   lp2.grad();
+  std::cerr << "alpha.vi_ = " << reinterpret_cast<long>(alpha.vi_)
+            << " alpha2.vi_ = " << reinterpret_cast<long>(alpha2.vi_) << "\n";

   EXPECT_FLOAT_EQ(lp.val(), lp2.val());
   for (size_t i = 0; i < 2; i++) {

leads to this output:

alpha.vi_ = 26002208 alpha2.vi_ = 26002208

Note that alpha and alpha2 are completely independent. Stan is using the arena allocator, so ASan wouldn't catch the use-after-free.

I haven't looked hard enough at neg_binomial_2_log_glm_lpmf.hpp to find the real bug (if any), though.

Current Version:

develop HEAD (cefeb9843181354148ddc74b68c2644533bcfe2e)

bob-carpenter commented 6 years ago

Thanks for reporting. This is still a work-in-progress, so @VMatthijs will have to report back on this one. I hadn't even realized he was tackling a neg-binomial GLM.

VMatthijs commented 6 years ago

Thanks for spotting this, @madeleineth ! This is fixed by PR https://github.com/stan-dev/math/pull/687 once that gets merged. Please note that your first suggested way changing the tests does not work. The second suggested way is correct, however.

andrjohns commented 5 years ago

Closed via #687

stan-dev / math