dynamicslab / pysindy

A package for the sparse identification of nonlinear dynamical systems from data
https://pysindy.readthedocs.io/en/latest/
Other
1.42k stars 310 forks source link

[BUG] test_optimizers_complexity.py fails intermittently - should it? #390

Open Jacob-Stevens-Haas opened 1 year ago

Jacob-Stevens-Haas commented 1 year ago

The two tests, test_complexity and test_complexity_parameter have had intermittent failures that are usually cleared by invoking another CI run. Mostly asking @Ohjeah and @briandesilva since you added & edited these tests.

  1. test_complexity_parameter tests that increasing regularization decreases the number of nonzero coefficients. Makes sense for STLSQ and SR3, doesn't make sense for Ridge, but I'm not sure about Lasso. I would think that's guaranteed, but it fails intermittently nonetheless. It appears to work better with normalize_columns = True, as I think I remember Candes and Tao's paper included a requirement for nearly-orthogonal columns/low condition number. But am I wrong/would that mess with guarantees (if there are any)?
  2. Are people ok if I remove test_complexity? It seems to test that regularizer classes's default parameters follow a standard hierarchy of most-to-least regularized. AFAIK, it isn't testing any mathematical invariant, (e.g. that SR3 complexity is less than lasso). Moreover, it's sensitive to the default regularization parameters, and random data can violate the test anyway, motivating a fudge factor in the assert. Finally, aren't individual tests for each optimizer class to make sure they're mathematically correct be the right way of testing the classes?
  3. They were in the #320 list of tests that take too long to run, so I cut the random data size parameters down a bunch. They now run much faster, but occasionally give hypothesis.errors.FailedHealthCheck: Examples routinely exceeded the max allowable size. (20 examples overran while generating 9 valid ones) Not sure what causes this, I'll look it up.