aesara-devs / aemcmc

AeMCMC is a Python library that automates the construction of samplers for Aesara graphs representing statistical models.
https://aemcmc.readthedocs.io/en/latest/
MIT License

Fix flaky tests #32

Closed rlouf closed 2 years ago

rlouf commented 2 years ago

The tests for the distributions rely on a specific random number generator implementation and seed. These tests broke as a result of an upstream change in aesara 2.6.5 (I suspect the change happened in https://github.com/aesara-devs/aesara/pull/939). This should not happen, so we need better tests for the distributions.
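To illustrate, here is a made-up sketch of the brittle pattern (not the actual test code): pinning exact draws to one seed and one implementation, versus checking a property that any correct sampler must satisfy.

```python
import numpy as np

# Brittle: hard-code the exact draws produced by one seed and one generator
# implementation; any upstream change to the sampling algorithm breaks the
# test even if the output is still statistically correct.
rng = np.random.default_rng(0)
samples = rng.normal(size=1000)
pinned_first_draw = samples[0]  # in a real test this would be a hard-coded literal
assert samples[0] == pinned_first_draw

# More robust: check a statistical property that any correct implementation
# must satisfy, with a tolerance tied to the sample size.
assert abs(samples.mean()) < 5.0 / np.sqrt(samples.size)
```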

codecov[bot] commented 2 years ago

Codecov Report

Merging #32 (288e5cd) into main (0a8cc00) will decrease coverage by 5.40%. The diff coverage is n/a.

:exclamation: Current head 288e5cd differs from pull request most recent head e1af18e. Consider uploading reports for the commit e1af18e to get more accurate results

@@             Coverage Diff             @@
##              main      #32      +/-   ##
===========================================
- Coverage   100.00%   94.59%   -5.41%     
===========================================
  Files            2        2              
  Lines          185      185              
  Branches        11       11              
===========================================
- Hits           185      175      -10     
- Misses           0        9       +9     
- Partials         0        1       +1     
| Impacted Files | Coverage Δ |
|----------------|------------|
| aemcmc/dists.py | 79.59% <0.00%> (-20.41%) :arrow_down: |


zoj613 commented 2 years ago

If the aesara package makes no guarantees about preserving its random stream, then we should not rely on the seed in any tests. We should think about how to verify the output of those distributions without relying on the seed.

zoj613 commented 2 years ago

Which properties of the multivariate normal do you think we could test for in the unit test, @rlouf?

EDIT: I think we could test whether the sample mean is close to the expected value of the distribution as described in the implementation? That would mean increasing the sample size in the unit test. Or we could even include a standard normality test like the Shapiro test; scipy has an implementation at https://docs.scipy.org/doc/scipy-1.8.0/html-scipyorg/reference/generated/scipy.stats.shapiro.html. We could generate a large sample and test whether the returned p-value is greater than a particular threshold (usually 0.05), i.e. that the test fails to reject normality.

EDIT2: This might not work as-is, since the Shapiro test checks against a univariate normal distribution.

Cc @brandonwillard
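Roughly what I have in mind, with made-up parameters (since the Shapiro test is univariate, it would have to be applied to a single marginal or a fixed projection of the multivariate samples):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng()
mean = np.array([1.0, -2.0])
cov = np.array([[2.0, 0.3], [0.3, 1.0]])
samples = rng.multivariate_normal(mean, cov, size=50_000)

# Moment check: the sample mean should be close to the theoretical mean,
# with a tolerance that shrinks as the sample size grows.
np.testing.assert_allclose(samples.mean(axis=0), mean, atol=0.05)

# Normality check on one marginal: we want the p-value *above* the threshold,
# i.e. the test fails to reject normality. Shapiro-Wilk is only reliable up
# to roughly 5000 observations, hence the subsample.
_, pvalue = stats.shapiro(samples[:5000, 0])
assert pvalue > 1e-3  # loose threshold so the test itself is not flaky
```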

rlouf commented 2 years ago

The first thing we want to make sure of is that the output samples have the correct shape. It is always hard to test the correctness of random functions' output, but we can probably rely on a theoretical result to design a simple test, or use an existing one.
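For the shape check, something along these lines; this uses aesara's RandomStream directly rather than the actual aemcmc samplers, so treat it as a sketch:

```python
import aesara
from aesara.tensor.random.utils import RandomStream

def test_sample_shape():
    srng = RandomStream(seed=2023)
    # The requested size should determine the output shape regardless of
    # which RNG implementation or seed is used underneath.
    x_rv = srng.normal(0, 1, size=(10, 3))
    fn = aesara.function([], x_rv)
    assert fn().shape == (10, 3)
```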

rlouf commented 2 years ago

I've added a test to check that the samples' shape is correct for each sampler. Should we merge this so we can rebase the other PRs and open a separate issue?

zoj613 commented 2 years ago

> I've added a test to check that the samples' shape is correct for each sampler. Should we merge this so we can rebase the other PRs and open a separate issue?

I think we can merge these tests as is, but I'd keep the pytest decorators since the tests are still incomplete and need to be revisited.

rlouf commented 2 years ago

@zoj613 Could you have a look at the PR? I added some simple moment-based tests but I am not sure about the formulas I used to compute the mean and variance from the parameters in the last two tests.
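The kind of check I mean, sketched with a gamma distribution whose moments are known in closed form (the actual distributions in the tests differ): fix the seed for reproducibility, but only assert loose statistical tolerances so an upstream RNG change cannot break the test.

```python
import numpy as np

def test_moments_gamma(shape=3.0, rate=2.0, n=200_000):
    # For Gamma(shape, rate): mean = shape / rate, variance = shape / rate**2.
    rng = np.random.default_rng(0)
    samples = rng.gamma(shape, scale=1.0 / rate, size=n)
    np.testing.assert_allclose(samples.mean(), shape / rate, rtol=0.05)
    np.testing.assert_allclose(samples.var(), shape / rate**2, rtol=0.05)
```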

rlouf commented 2 years ago

I simplified the tests that were not passing; we can merge this so we can move forward on the other PRs and open an issue to add more tests on the distributions.

zoj613 commented 2 years ago

> I simplified the tests that were not passing; we can merge this so we can move forward on the other PRs and open an issue to add more tests on the distributions.

I can't review your previous tests because the force-push got rid of the commits, so I'll post here:

```python
# line in function testing cong2017
var = 1.0 / (A + np.dot(phi.T, np.dot(omega, phi)))
```

From the reference, page 17, the precision matrix is not diagonal; only A and Omega are diagonal (phi is not). var in your case is supposed to be an actual matrix inverse, i.e. at.inv(A + np.dot(phi.T, np.dot(omega, phi))). Also, with mean = var * np.dot(phi.T, omega) * t, we have to make sure we use a matrix-vector multiplication, but it appears an elementwise product was used here (I'm not too sure).
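In plain numpy, my reading of the reference is roughly the following (dimensions made up for illustration): the covariance is a full matrix inverse and the mean is a matrix-vector product.

```python
import numpy as np

rng = np.random.default_rng(0)
d, k = 4, 6
A = np.diag(rng.uniform(1.0, 2.0, size=d))      # diagonal, d x d
omega = np.diag(rng.uniform(1.0, 2.0, size=k))  # diagonal, k x k
phi = rng.normal(size=(k, d))                   # dense, k x d
t = rng.normal(size=k)

# The precision matrix A + phi.T @ omega @ phi is dense, so the covariance
# is a full matrix inverse, not an elementwise reciprocal.
prec = A + phi.T @ omega @ phi
cov = np.linalg.inv(prec)

# The mean is a matrix-vector product, not an elementwise product.
mean = cov @ (phi.T @ (omega @ t))
```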

I think we can merge these tests as is, but I'd keep the pytest decorators since the tests are still incomplete and need to be revisited. I will take a look at them and see if I can come up with better tests.

rlouf commented 2 years ago

I opened #33 to keep track of this. Can someone approve and merge, unless there is something else?

zoj613 commented 2 years ago

@brandonwillard I think we can continue this in a separate PR; I've assigned that to myself. Merging is blocked due to your previously requested change.