mlcommons / modelgauge

Make it easy to automatically and uniformly measure the behavior of many AI Systems.

https://mlcommons.org/ai-safety/

Apache License 2.0

Investigate why modelgauge test runs fail #506

Closed. rogthefrog closed this issue 3 months ago

rogthefrog commented 3 months ago

https://github.com/mlcommons/modelgauge/actions/runs/10341862847/job/28624140017

rogthefrog commented 3 months ago

There are two "fixes":

- One clears storage so the test doesn't run out of space on the device.
- The other installs modelgauge locally as a package from the code checked out in the runner, rather than from PyPI, and checks whether all the tests can be loaded.

The first fix is fairly non-controversial :)

The second fix requires a conversation with @wpietri to clarify the goal of this particular test.
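
For reference, the disk-space concern behind the first fix could be checked with something like the sketch below. This is not the change made in the workflow; the helper names and the 5 GiB threshold are assumptions chosen for illustration.

```python
import shutil


def free_gib(path="."):
    """Return the free space on the filesystem containing `path`, in GiB."""
    return shutil.disk_usage(path).free / 2**30


def has_room(path=".", min_free_gib=5.0):
    """Report whether at least `min_free_gib` GiB are free.

    The 5 GiB default is an assumed threshold, not a number from this issue.
    """
    return free_gib(path) >= min_free_gib


if __name__ == "__main__":
    print(f"{free_gib():.1f} GiB free; enough room: {has_room()}")
```

Running a check like this before the test step would make an out-of-space failure show up as an explicit message rather than a failed install midway through the job.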

wpietri commented 3 months ago

I think back in the day this was to make sure that our released artifact continued to work. I suspect that's still a good thing to do, but let's discuss in stand-up a) whether this is the right test for that, b) whether there's a better way to keep the version tested in sync, and c) whether what this tests is also covered during our pull-request tests.

rogthefrog commented 3 months ago

The initial goal is to make sure the published PyPI package still works (not the local version). The fix: remove the hardcoded package version and use the latest release.
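
A minimal sketch of what "use latest, and confirm it is the published package" might look like in Python. The helper names are hypothetical and this is not the actual workflow code. It assumes pip is the installer, and relies on pip recording a `direct_url.json` file (PEP 610) for installs made from a local directory or URL, but not for installs resolved from an index such as PyPI.

```python
import json
import sys
from importlib import metadata


def pip_install_command(package, version=None):
    """Build a pip command line (hypothetical helper).

    With no version given, the requirement is unpinned, so pip resolves
    the latest release available on the index.
    """
    requirement = f"{package}=={version}" if version else package
    return [sys.executable, "-m", "pip", "install", "--upgrade", requirement]


def install_origin(package):
    """Describe where an installed distribution came from.

    Returns "not installed", "direct:<url>" for a local or URL install,
    or "index-or-unknown" when no direct_url.json was recorded.
    """
    try:
        dist = metadata.distribution(package)
    except metadata.PackageNotFoundError:
        return "not installed"
    raw = dist.read_text("direct_url.json")  # None when the file is absent
    if raw is None:
        return "index-or-unknown"
    return "direct:" + json.loads(raw).get("url", "?")


if __name__ == "__main__":
    print(" ".join(pip_install_command("modelgauge")))
    print(install_origin("modelgauge"))
```

If `install_origin` reports a `direct:` URL inside this test, the job is exercising the checked-out code rather than the released artifact, which is the mix-up discussed above.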

rogthefrog commented 3 months ago

OK I think this is good to go @wpietri