flatironinstitute / bayes-kit

Bayesian inference and posterior analysis for Python
MIT License

HMC Tests Are Failing #35

Closed gil2rok closed 1 year ago

gil2rok commented 1 year ago

Overview

Most of the tests in test_hmc.py fail, likely as a result of the recent refactoring of hmc.py. To reproduce, simply call the functions in test_hmc.py from the base directory of the bayes-kit repo.

I noticed that there are two distinct errors causing these tests to fail.

List Comprehension Error

The first issue occurs when using list comprehension to sample from the hmc object:

draws = np.array([hmc.sample()[0] for _ in range(M)])

This line of code is used throughout test_hmc.py and is supposed to create an array of HMC samples. Instead, this code creates an array containing the last hmc.sample() element repeated M times. My hunch is that this occurs because of late-binding in Python lists, as explained here.

To resolve this issue on my machine, I simply replaced the list comprehension with an explicit loop:

draws = np.zeros(M)
for i in range(M):
    draws[i] = hmc.sample()[0]

Statistical Error

After fixing the list comprehension error, the test_hmc_diag_rep() and test_hmc_binom() tests still fail. These errors are harder to diagnose and may arise from incorrect HMC implementation, improper HMC parameters, etc.
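For context, here is a rough, self-contained sketch of the kind of check involved (hypothetical code, not the actual bayes-kit tests, with a stand-in sampler so it runs on its own): the sample moments of the draws are compared against the known moments of the target within a tolerance, so an incorrect sampler, poorly chosen parameters, or an unlucky random seed can all trip the assertions.

import numpy as np

# Hypothetical stand-in for hmc.sample()[0], drawing directly from the
# intended standard-normal target so this sketch is runnable by itself.
rng = np.random.default_rng(42)

M = 10_000
draws = np.zeros(M)
for i in range(M):
    draws[i] = rng.standard_normal()

# Moment checks of this general sort (an assumption about what the numpy
# assert statements in test_hmc.py verify): compare the sample mean and
# variance to the target's mean 0 and variance 1.
np.testing.assert_allclose(np.mean(draws), 0.0, atol=0.05)
np.testing.assert_allclose(np.var(draws), 1.0, atol=0.05)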

WardBrian commented 1 year ago

Instead, this code creates an array containing the last hmc.sample() element repeated M times.

This does not seem to be the case, at least not for me locally. If I print out draws, the elements are certainly not all the same. Can you share a bit more about your Python environment and how you observed this behavior?

As for the statistical errors, our tests currently leave a lot to be desired in terms of their propensity for randomized failures. However, they should not be failing consistently: locally, the HMC tests fail for me quite rarely, less than 1 out of 10 runs. Some discussion of how to improve this further is in #27 and #31.

gil2rok commented 1 year ago

Environment: I'm using Python 3.11.3, installed in a local conda environment. I ran the tests from test_hmc.py in a Jupyter notebook located in the root of the bayes-kit repository, importing the test functions with relative import statements.

Detecting this Bug: The test_hmc_diag_std_normal() test fails via the numpy assert statements. I traced the bug down to draws repeating the last sample from hmc.sample(), which I confirmed by printing draws and the result of np.unique(draws).

Statistical Tests: Perhaps something is wrong with my local machine, as these tests fail consistently for me.

WardBrian commented 1 year ago

@gil2rok and I debugged this in person; it was caused by an aliasing issue in his local edits to the HMC code.

Note for the future: x = x + y and x += y have different behavior for numpy arrays! The first creates a fresh allocation, while the second re-uses x's memory. If you're not careful, this can cause aliasing bugs (and can also mutate the inputs to your functions when you didn't intend to!)
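A minimal sketch of that failure mode (hypothetical function names, not the actual bayes-kit code): the += version silently writes into the array the caller passed in, while the x = x + y version leaves it alone.

import numpy as np

def step_buggy(theta, grad):
    # += writes into the caller's array: theta is mutated as a side effect.
    theta += 0.1 * grad
    return theta

def step_ok(theta, grad):
    # x = x + y allocates a fresh array; the caller's theta is untouched.
    theta = theta + 0.1 * grad
    return theta

theta0 = np.zeros(3)
grad = np.ones(3)

step_ok(theta0, grad)
print(theta0)  # [0. 0. 0.] -- unchanged

step_buggy(theta0, grad)
print(theta0)  # [0.1 0.1 0.1] -- mutated in place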

bob-carpenter commented 1 year ago

x = x + y and x += y have different behavior for numpy arrays!

Bad Python! Thanks for pointing this out.

In C++, this is considered Very Bad Form. Standard practice is to define the binary operator + in terms of += (not the other way around) to make sure they match!

Here are the rules for built-ins: https://en.cppreference.com/w/cpp/language/operator_assignment:

The behavior of every builtin compound-assignment expression E1 op= E2 (where E1 is a modifiable lvalue expression and E2 is an rvalue expression or a braced-init-list (since C++11)) is exactly the same as the behavior of the expression E1 = E1 op E2, except that the expression E1 is evaluated only once and that it behaves as a single operation with respect to indeterminately-sequenced function calls (e.g. in f(a += b, g()), the += is either not started at all or is completed as seen from inside g()).
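The same convention can be written down in Python with __iadd__ and __add__; here is a toy sketch (not bayes-kit or numpy code) of defining + by copying and then delegating to +=, so the two operators always agree:

import numpy as np

class Vec:
    # Toy container illustrating the convention: define + in terms of +=.
    def __init__(self, data):
        self.data = np.asarray(data, dtype=float)

    def __iadd__(self, other):
        # x += y: modify this object's buffer in place.
        self.data += other.data
        return self

    def __add__(self, other):
        # x + y: copy first, then reuse += so the semantics stay matched.
        result = Vec(self.data.copy())
        result += other
        return result

a, b = Vec([1.0, 2.0]), Vec([3.0, 4.0])
c = a + b
print(a.data, c.data)  # [1. 2.] [4. 6.] -- a is untouched because + copied first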

WardBrian commented 1 year ago

Yeah, it definitely seems like a pretty big foot-gun in a core part of the syntax for numpy. numpy defines x + y to be equivalent to np.add(x, y, out=None) (where out=None is the usual "please allocate a new place for the result") and x += y as np.add(x, y, out=x).
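Spelled out as a quick check (just restating the equivalence described above):

import numpy as np

x = np.arange(3.0)  # [0. 1. 2.]
y = np.ones(3)

z = np.add(x, y, out=None)  # equivalent to z = x + y: allocates a new array
print(z is x)               # False

np.add(x, y, out=x)         # equivalent to x += y: result written back into x
print(x)                    # [1. 2. 3.]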

I think this is pretty underdocumented, but the un-searchability of += makes it hard to say for certain there isn't a page warning you about this. The "quickstart" page says:

Some operations, such as += and *=, act in place to modify an existing array rather than create a new one.

but it doesn't describe what the consequences of this are in practice.