Open zaqqwerty opened 3 years ago
Another way is just to use a fixed seed so that the sampling error is deterministic. I think I see why you didn’t want to do this, but I worry a bit about potential test flakiness, i.e., we are allowing some (perhaps arbitrarily small) probability of presubmits breaking when we allow for non-determinism. Hence, fixing a seed is common in TFP and elsewhere.
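To make the fixed-seed suggestion concrete, here is a minimal sketch using plain NumPy rather than the project's own sampling code. The function name `mc_mean_estimate`, the normal distribution, and the seed value are all hypothetical, chosen only for illustration.

```python
import numpy as np


def mc_mean_estimate(num_samples, seed):
    """Monte Carlo estimate of the mean of a Normal(1, 1) variable.

    Hypothetical stand-in for a sampled quantity checked in a test.
    """
    # A generator built from a fixed seed yields the same sample stream
    # on every run, so the sampling error is the same every time.
    rng = np.random.default_rng(seed)
    samples = rng.normal(loc=1.0, scale=1.0, size=num_samples)
    return samples.mean()


# Two runs with the same seed agree exactly, so a tolerance check
# against the true mean either always passes or always fails.
first = mc_mean_estimate(10_000, seed=1234)
second = mc_mean_estimate(10_000, seed=1234)
assert first == second
```

The tradeoff is that a fixed seed hides the sampling error instead of reducing it: the test becomes reproducible, but a passing result only speaks for that one sample stream.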
Some tests have tolerances that are fairly loose; for example, #84 has a 3% relative tolerance. As expected, the tolerance can be tightened if we increase the number of samples used for the calculation. But currently 1e6 samples are required to reach this 3% tolerance, and 1e7 to reach 1%, which seems excessive. Resolving this issue would involve finding any sources of numerical imprecision that we can better control, so that fewer samples are needed to enforce a 1% relative tolerance everywhere.
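As a rough sanity check on those numbers, the sketch below assumes the error is dominated by Monte Carlo sampling noise, which shrinks like 1/sqrt(N). Under that assumption (not something verified against the actual tests), the sample count needed for a target tolerance can be extrapolated from one reference point. The helper `samples_needed` is hypothetical.

```python
def samples_needed(n_ref, tol_ref, tol_target):
    """Extrapolate the sample count needed to reach tol_target.

    Assumes relative error scales as 1 / sqrt(N), so that
    N_target = N_ref * (tol_ref / tol_target) ** 2.
    """
    return n_ref * (tol_ref / tol_target) ** 2


# Reference point from the issue: 1e6 samples for a 3% tolerance.
n_for_one_percent = samples_needed(1e6, tol_ref=0.03, tol_target=0.01)
print(f"{n_for_one_percent:.1e}")  # on the order of 1e7
```

The extrapolation lands close to the 1e7 figure quoted above, which is what pure sampling error would predict. If that holds, the number of samples for a given tolerance can only drop by lowering the error prefactor (for example, the variance of the estimator), since the 1/sqrt(N) rate itself is fixed.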