ludwig-ai / ludwig

Low-code framework for building custom LLMs, neural networks, and other AI models
http://ludwig.ai
Apache License 2.0

Benchmark performance regression: mercedes_benz_greener.ecd.yaml #2978

Closed by tgaddair 3 weeks ago

tgaddair commented 1 year ago
FAILED tests/regression_tests/benchmark/test_model_performance.py::test_performance[mercedes_benz_greener.ecd.yaml] - AssertionError: The obtained r2 value (0.32296961545944214) was not within 15.0% of the expected value (0.47405338287353516).
assert 0.15108376741409302 <= 0.07110800743103027
 +  where 0.15108376741409302 = abs((0.47405338287353516 - 0.32296961545944214))
 +    where 0.47405338287353516 = ExpectedMetric(output_feature_name='y', metric_name='r2', expected_value=0.47405338287353516, tolerance_percentage=0.15).expected_value
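
For readers parsing the assertion above: the "within 15.0%" bound is relative to the expected value, so the allowed absolute difference is 0.474... * 0.15 = 0.0711..., while the observed difference is 0.151.... A minimal sketch of that check follows. It is reconstructed from the assertion output only and is not Ludwig's actual test code: the `ExpectedMetric` fields are copied from the log, `within_tolerance` is a hypothetical helper, and taking `abs()` of the expected value is an assumption.

```python
from dataclasses import dataclass


@dataclass
class ExpectedMetric:
    # Field names copied from the assertion output; the real class may differ.
    output_feature_name: str
    metric_name: str
    expected_value: float
    tolerance_percentage: float


def within_tolerance(obtained: float, expected: ExpectedMetric) -> bool:
    """Hypothetical helper: relative-tolerance check inferred from the log."""
    # Allowed absolute deviation is a fraction of the expected value.
    # abs() is an assumption so that negative expected values also work.
    allowed = abs(expected.expected_value) * expected.tolerance_percentage
    return abs(expected.expected_value - obtained) <= allowed


# Values taken from the failing run reported above.
expected_r2 = ExpectedMetric(
    output_feature_name="y",
    metric_name="r2",
    expected_value=0.47405338287353516,
    tolerance_percentage=0.15,
)
obtained_r2 = 0.32296961545944214

regressed = not within_tolerance(obtained_r2, expected_r2)
```

With these numbers `regressed` is true, since the r2 would have needed to be at least about 0.403 to pass.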

https://github.com/ludwig-ai/ludwig/actions/runs/3970007361/jobs/6805218517

abidwael commented 1 year ago

The first PR to show this regression is https://github.com/ludwig-ai/ludwig/pull/2890. I will take a deeper look.

abidwael commented 1 year ago

@tgaddair I looked at the PR after which the tests started failing (https://github.com/ludwig-ai/ludwig/pull/2890) for anything that could affect training and cause the regression, but didn't find anything. I get the sense that running these tests on merge to master didn't help us catch these regressions, so I suggest we move them to run on push to branch. They only take 1m39s (excluding install), so they won't add much time anyway.

PR with benchmark test fixes: https://github.com/ludwig-ai/ludwig/pull/3115