pymc-devs / pymc

Bayesian Modeling and Probabilistic Programming in Python
https://docs.pymc.io/
Other
8.63k stars 1.99k forks source link

Potential flaky test: truncation_discrete_random #6206

Closed OriolAbril closed 1 year ago

OriolAbril commented 1 year ago

Description of your problem

The test pymc/tests/distributions/test_truncated.py::test_truncation_discrete_random[rejection-2-inf] failed for #1246 which modifies sample_posterior_predictive only. I think it is (or has become) a flaky test as it did pass in previous commits but failed eventually when the only changes had been to test files to fix tests related to sample_posterior_predictive.

mattiadg commented 1 year ago

1246 is an issue from 2016, and tests #1246 was successful https://github.com/pymc-devs/pymc/actions/runs/3230416900.

Where can we find the error?

mattiadg commented 1 year ago

This looks different from the other 2 issues about Flaky tests #6210 and #6211. Here, it looks like the author of the test expected it to pass for any draws https://github.com/mattiadg/pymc/blob/main/pymc/tests/distributions/test_truncated.py#L167-L181 I am not sure if it's correct to always expect to get at least one value equal to the lower bound of the distribution and one equal to the upper bound, but it looks like a strange thing to test only for a specific seed, and it also passed almost every time so far. Maybe this can be a symptom of a small issue in the distribution?

Also, this test is not a SeededTest, which makes me think the idea was exactly that those tests must hold for all possible draws

ricardoV94 commented 1 year ago

From the first comment, the condition that fails was probably this one: https://github.com/mattiadg/pymc/blob/7503730dd20d4e7318b31a9834951aae647929d7/pymc/tests/distributions/test_truncated.py#L182-L184

The test intends to check that 3 draws are not sufficient to obtain 500 draws from a Geometric(0.2) that do not include 1 via rejection sampling. This might not have been enough though. I think the probability is supposed to be (1 - 0.2**3) ** 500 = 0.018. That is the probability of not getting 1 three times in a row in 500 independent draws. Anyway, it's probably better to just seed it, and mention the expected probability in case our seed changes in the future and we have to reassess whether it is still working or not.

mattiadg commented 1 year ago

And would you seed it by adding a seed in the function or by creating a new class that derives from Seeded Test?

Il lun 17 ott 2022, 11:12 Ricardo Vieira @.***> ha scritto:

From the first comment, the condition that fails was probably this one: https://github.com/mattiadg/pymc/blob/7503730dd20d4e7318b31a9834951aae647929d7/pymc/tests/distributions/test_truncated.py#L182-L184

The test intends to check that 3 draws are not sufficient to obtain 500 draws from a Geometric(0.2) that are above 2 via rejection sampling. This might not have been enough though. I think the probability is supposed to be (1 - 0.23) 500 = 0.018. That is the probability of not getting 1 three times in a row in 500 independent draws. Anyway, it's probably better to just seed it, and mention the expected probability in case our seed changes in the future and we have to reassess whether it is still working or not.

— Reply to this email directly, view it on GitHub https://github.com/pymc-devs/pymc/issues/6206#issuecomment-1280537867, or unsubscribe https://github.com/notifications/unsubscribe-auth/AA7LDIQXOH5O6E6VTHSYNW3WDUKBPANCNFSM6AAAAAARCWNP4I . You are receiving this because you commented.Message ID: @.***>

ricardoV94 commented 1 year ago

And would you seed it by adding a seed in the function or by creating a new class that derives from Seeded Test? Il lun 17 ott 2022, 11:12 Ricardo Vieira @.***> ha scritto:

Pass a seed to the draw function. The SeededTest is mostly a legacy thing when we used to rely on global seeding internally

ricardoV94 commented 1 year ago

Closed via #6229