Closed gil2rok closed 1 year ago
> Instead, this code creates an array containing the last `hmc.sample()` element repeated `M` times.
This does not seem to be the case, at least not for me locally. If I print out `draws`, the elements are certainly not all the same. Can you share a bit more about your Python environment and how you observed this behavior?
As for statistical error, our tests are currently prone to randomized failures, which leaves a lot to be desired. However, they should not be failing consistently - locally, the HMC tests fail for me quite rarely, less than 1 out of 10 runs. Some discussion of how to improve this further is in #27 and #31.
Environment: I'm using Python 3.11.3 installed in my local conda environment. I ran the tests from `test_hmc.py` from a Jupyter notebook located in the root of the bayes-kit repository. I imported the functions containing the tests with relative import statements.
Detecting this Bug: The `test_hmc_diag_std_normal()` test fails via the numpy assert statements. I traced the bug down to `draws` repeating the last sample from `hmc.sample()`. This was confirmed by printing out `draws` and also printing out the result of `np.unique(draws)`.
Statistical Tests: Perhaps something is wrong with my local machine, as these tests fail consistently for me.
@gil2rok and I debugged this in person; it was caused by an aliasing issue in his local edits to HMC.
Note for the future: `x = x + y` and `x += y` have different behavior for numpy arrays! The first creates a fresh allocation, while the second re-uses `x`'s memory. If you're not careful, this can cause aliasing bugs (and can also mutate the inputs to your functions when you didn't intend to!)
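A minimal sketch of how this plays out (the sampler here is hypothetical, not the actual bayes-kit code): because `+=` mutates one array in place, every draw the caller receives is a reference to the same object, producing exactly the "last sample repeated M times" symptom.

```python
import numpy as np

class BuggySampler:
    """Hypothetical sampler illustrating the aliasing bug."""

    def __init__(self):
        self.theta = np.zeros(2)
        self._rng = np.random.default_rng(0)

    def sample(self):
        self.theta += self._rng.normal(size=2)  # in-place: mutates self.theta
        return self.theta  # caller receives a reference, not a copy

sampler = BuggySampler()
draws = np.array([sampler.sample() for _ in range(5)])

# The list held 5 references to ONE array; np.array then copied its
# final state 5 times, so every row of `draws` is identical.
print(len(np.unique(draws, axis=0)))  # -> 1

# Writing `self.theta = self.theta + step` (fresh allocation) or
# returning `self.theta.copy()` avoids the aliasing.
```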
> `x = x + y` and `x += y` have different behavior for numpy arrays!

Bad Python! Thanks for pointing this out.
In C++, this is considered Very Bad Form. Standard practice is to define the binary operator `+` in terms of `+=` (not the other way around) to make sure they match!
Here are the rules for built-ins: https://en.cppreference.com/w/cpp/language/operator_assignment:

> The behavior of every builtin compound-assignment expression E1 op= E2 (where E1 is a modifiable lvalue expression and E2 is an rvalue expression or a braced-init-list (since C++11)) is exactly the same as the behavior of the expression E1 = E1 op E2, except that the expression E1 is evaluated only once and that it behaves as a single operation with respect to indeterminately-sequenced function calls (e.g. in f(a += b, g()), the += is either not started at all or is completed as seen from inside g()).
Yeah, it definitely seems like a pretty big foot-gun in a core part of the syntax for numpy.
numpy defines `x + y` to be equivalent to `np.add(x, y, out=None)` (where `out=None` is the usual "please allocate a new place for the result") and `x += y` as `np.add(x, y, out=x)`.
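This equivalence is easy to check directly: `out=None` leaves the left operand untouched, while `out=x` (like `+=`) writes into its existing buffer.

```python
import numpy as np

x = np.array([1.0, 2.0])
y = np.array([10.0, 10.0])

fresh = np.add(x, y)   # out=None: allocates a new array, x untouched
assert fresh is not x and x[0] == 1.0

np.add(x, y, out=x)    # writes into x's buffer, same effect as x += y
assert x[0] == 11.0

z = np.array([1.0, 2.0])
z_id = id(z)
z += y                 # in-place: still the same object
assert id(z) == z_id
z = z + y              # fresh allocation: rebinds z to a new object
assert id(z) != z_id
```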
I think this is pretty underdocumented, but the un-searchability of `+=` makes it hard to say for certain there isn't a page warning you about this. The "quickstart" page says:

> Some operations, such as `+=` and `*=`, act in place to modify an existing array rather than create a new one.
but it doesn't describe what the consequences of this are in practice.
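One concrete consequence, as a small sketch: an in-place operation inside a function silently mutates the caller's array, while the allocating form leaves it alone.

```python
import numpy as np

def normalize_bad(v):
    v /= np.linalg.norm(v)        # in-place: mutates the caller's array!
    return v

def normalize_ok(v):
    return v / np.linalg.norm(v)  # fresh allocation: caller's array untouched

x = np.array([3.0, 4.0])
normalize_bad(x)
print(x)  # -> [0.6 0.8]  (the caller's x was modified)

y = np.array([3.0, 4.0])
normalize_ok(y)
print(y)  # -> [3. 4.]    (unchanged)
```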
Overview
Most of the tests in `test_hmc.py` fail, likely as a result of the recent refactoring of `hmc.py`. To reproduce, simply call the functions in `test_hmc.py` from the base directory of the bayes-kit repo. I noticed that there are two distinct errors causing these tests to fail.
List Comprehension Error
The first issue occurs when using a list comprehension to sample from the `hmc` object:

draws = np.array([hmc.sample()[0] for _ in range(M)])

This line of code is used throughout `test_hmc.py` and is supposed to create an array of HMC samples. Instead, it creates an array containing the last `hmc.sample()` element repeated `M` times. My hunch is that this occurs because of late binding in Python lists, as explained here. To resolve this issue on my machine, I have simply removed the list comprehension:
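A sketch of what a loop-based replacement might look like (the `_StubSampler` below is a hypothetical stand-in for the real `hmc` object, which is not shown in this thread): assigning each draw into a preallocated float array copies the value, so even a sampler that returns the same mutated array every call cannot collapse the draws.

```python
import numpy as np

class _StubSampler:
    """Hypothetical stand-in for the HMC sampler; sample() returns (theta, logp)."""

    def __init__(self):
        self._rng = np.random.default_rng(0)
        self._theta = np.zeros(1)

    def sample(self):
        self._theta += self._rng.normal(size=1)  # in-place, aliasing-prone
        return self._theta, 0.0

hmc = _StubSampler()
M = 100

# Explicit loop instead of the comprehension: scalar assignment into
# `draws` copies the value out of the (possibly aliased) returned array.
draws = np.empty(M)
for m in range(M):
    draws[m] = hmc.sample()[0][0]

assert np.unique(draws).size > 1  # draws are no longer all identical
```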
Statistical Error
After fixing the list comprehension error, the `test_hmc_diag_rep()` and `test_hmc_binom()` tests still fail. These errors are harder to diagnose and may arise from an incorrect HMC implementation, improper HMC parameters, etc.