ami-iit / adam

adam implements a collection of algorithms for calculating rigid-body dynamics in Jax, CasADi, PyTorch, and Numpy.
https://adam-docs.readthedocs.io/en/latest/
BSD 3-Clause "New" or "Revised" License

Test results are non-deterministic #42

Open traversaro opened 1 year ago

traversaro commented 1 year ago

For example, see this output of two test runs on the same commit:

The reason for this is that we call np.random without setting a seed, so the test results differ at every run (see https://adamj.eu/tech/2018/01/08/pytest-randomly-history/ and https://towardsdatascience.com/random-seeds-and-reproducibility-933da79446e3). The long-term plan may be to implement some way of controlling randomness (for example via https://github.com/pytest-dev/pytest-randomly), but in the short term perhaps the easy fix is to increase the test tolerance.
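The failure mode can be sketched with plain NumPy (illustrative only, not adam's actual test code): without a fixed seed, every run draws different inputs, while seeding makes the drawn sequence reproducible across runs.

```python
import numpy as np

# Unseeded: each run of the test process draws a different random input,
# so a test comparing framework outputs against a fixed tolerance can
# fail sporadically depending on the configuration that happens to be drawn.
q_unseeded = np.random.rand(3)

# Seeded: the same seed always yields the same sequence of draws,
# simulating two separate test runs that now see identical inputs.
np.random.seed(0)
q_first_run = np.random.rand(3)
np.random.seed(0)
q_second_run = np.random.rand(3)

assert np.array_equal(q_first_run, q_second_run)  # identical draws across "runs"
```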

DanielePucci commented 1 year ago

CC @ami-iit/artificial-mechanical-intelligence

Giulero commented 1 year ago

Thanks @traversaro! :) Agreed! I'll open a PR to increase the tolerance, but I plan to address this issue in a more structured way. P.S. I do not really understand why Jax and PyTorch tests are the only ones failing. Is there some strange interaction between these two frameworks and numpy?

P.P.S. Maybe it's a stupid solution. What if I set np.random.seed(0)? I guess it will always give me the same sequence of random numbers.

traversaro commented 1 year ago

> P.P.S. Maybe it's a stupid solution. What if I set np.random.seed(0)? I guess it will always give me the same sequence of random numbers.

That for sure should work fine. Reading https://adamj.eu/tech/2018/01/08/pytest-randomly-history/ and similar tests, it seems that people do not like it as you perturb the global state and so you could influence other tests, but for our specific case it should work fine (that is what we do in iDynTree, for example: https://github.com/robotology/idyntree/blob/35b0f76a9db3809384e8ebcbdb7cfb11d2cb7a7b/bindings/python/tests/joints.py#L31 and https://github.com/robotology/idyntree/blob/35b0f76a9db3809384e8ebcbdb7cfb11d2cb7a7b/src/estimation/tests/KalmanFilterUnitTest.cpp#L84).
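An alternative that sidesteps the global-state objection is NumPy's Generator API, which seeds locally instead of mutating np.random's global state (a sketch assuming NumPy >= 1.17; `random_joint_configuration` is a hypothetical helper, not an adam function):

```python
import numpy as np

def random_joint_configuration(n_joints, seed=42):
    # Hypothetical helper: a local Generator is seeded here, so the global
    # np.random state is untouched and other tests cannot be influenced.
    rng = np.random.default_rng(seed)
    return rng.uniform(-np.pi, np.pi, size=n_joints)

q1 = random_joint_configuration(5)
q2 = random_joint_configuration(5)
assert np.array_equal(q1, q2)  # same seed, same joint configuration
```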

> P.S. I do not really understand why Jax and PyTorch tests are the only ones failing. Is there some strange interaction between these two frameworks and numpy?

I guess that for some joint configurations the numeric error induced by how these frameworks perform the computation is bigger, but it is just an intuition.

Giulero commented 1 year ago

> That for sure should work fine. Reading https://adamj.eu/tech/2018/01/08/pytest-randomly-history/ and similar tests, it seems that people do not like it as you perturb the global state, and so you could influence other tests, but for our specific case it should work fine

So I could start with this approach and then proceed with a more refined solution.

> I guess that for some joint configurations the numeric error induced by how these frameworks perform the computation is bigger, but it is just an intuition.

I suspect the same. I ran some tests, and it seems that setting torch.set_default_dtype(torch.float64) and config.update("jax_enable_x64", True) (as in #39), along with np.random.seed(42), does the job, and the tests no longer fail.
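For reference, the combination described above might look like this at the top of a test module (a sketch, assuming recent torch and jax versions; note that the import path for the jax config object has changed across jax releases):

```python
import numpy as np
import torch
from jax import config

# Force double precision so PyTorch and JAX results match NumPy's
# default float64 closely, shrinking the numeric error across frameworks.
torch.set_default_dtype(torch.float64)
config.update("jax_enable_x64", True)

# Fix the global seed so every test run draws the same joint configurations.
np.random.seed(42)
```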

Giulero commented 1 year ago

Just for the record, the solution proposed in https://github.com/ami-iit/ADAM/issues/42#issuecomment-1549505862 is implemented in #39.