Previously, we wrote regression tests. This was problematic because we could not get reliable results across operating systems.
For example, one test gave three different results on different operating systems:
In this particular case, we think the cause is numeric precision in LBFGS, which differs across operating systems (in torch, at least; base R does not show this issue).
So instead of expecting identical results from identical code, we will test that the models are actually learning from the data. This also prepares us for upcoming GPU support, where reproducibility guarantees are even weaker.
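A minimal sketch of what such a "the model is learning" check can look like. This is illustrative only (the actual tests are written in R; it is shown here in Python with numpy, and every name is hypothetical): rather than asserting exact coefficients, which vary with OS, BLAS, and optimizer precision, we assert that training reduces the loss.

```python
import numpy as np

# Simulate a small regression problem with known structure.
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 3))
true_w = np.array([1.0, -2.0, 0.5])
y = X @ true_w + rng.normal(scale=0.1, size=100)

def mse(w):
    """Mean squared error of linear model with weights w."""
    return float(np.mean((X @ w - y) ** 2))

# Train with plain gradient descent from a zero initialization.
w = np.zeros(3)
initial_loss = mse(w)
for _ in range(200):
    grad = 2 * X.T @ (X @ w - y) / len(y)
    w -= 0.1 * grad
final_loss = mse(w)

# The portable assertion: learning happened, regardless of the exact
# values reached on any particular platform.
assert final_loss < initial_loss
```

The same idea carries over to testthat, e.g. an `expect_lt(final_loss, initial_loss)` style check, which stays green even when the fitted values drift slightly between operating systems.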
The PR will also revisit the entire test suite and convert errors to use cli.