The test `test_tcn_adding` is flaky. This PR addresses that issue.
To find a suitable bound, I collected samples of `training_loss` from several test executions and looked at the tail of the distribution, computing extreme percentiles to check how high the values can get:
- 0.99: 8e-3
- 0.999: 3.6e-3
- 0.9999: 7e-2
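
Something along these lines is enough to compute the percentiles; this is only a sketch, and the sample file name is a placeholder rather than anything committed in this PR:

```python
# Rough sketch: assumes the collected training_loss samples were dumped to a
# plain-text file with one value per line, and that NumPy is available.
import numpy as np

losses = np.loadtxt("training_loss_samples.txt")  # hypothetical file name
for q in (0.99, 0.999, 0.9999):
    print(f"p{q}: {np.quantile(losses, q):.1e}")
```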
Based on this, I updated the threshold in the fix to `1e-2`. I think deriving the bound from a statistical evaluation like this might be a good way to ensure the test is not flaky.
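
For context, the relaxed check ends up looking roughly like the sketch below; the helper and variable names are placeholders, not the actual code in `test_tcn_adding`:

```python
# Minimal sketch of the relaxed bound; names are placeholders, not the real test code.
LOSS_THRESHOLD = 1e-2  # chosen from the tail percentiles above

def assert_converged(training_loss: float) -> None:
    assert training_loss < LOSS_THRESHOLD, (
        f"training_loss={training_loss:.2e} exceeds bound {LOSS_THRESHOLD:.0e}"
    )

assert_converged(3.6e-3)  # a typical run passes comfortably
```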
Do you guys think this makes sense? Please let me know if this looks good or if you have any other suggestions. Also, here I assume there are no bugs in the code under test.