mlcommons / modelgauge

Make it easy to automatically and uniformly measure the behavior of many AI Systems.
https://mlcommons.org/ai-safety/
Apache License 2.0

Investigate why modelgauge test runs fail #506

Closed: rogthefrog closed this 3 months ago

rogthefrog commented 3 months ago

https://github.com/mlcommons/modelgauge/actions/runs/10341862847/job/28624140017

rogthefrog commented 3 months ago

There are two "fixes":

The first fix is fairly non-controversial :)

The second fix requires a conversation with @wpietri to clarify the goal of this particular test.

wpietri commented 3 months ago

I think back in the day this was to make sure that our released artifact continued to work. I suspect that's still a good thing to do, but let's discuss in stand-up a) whether this is the right test for that, b) whether there's a better way to keep the version tested in sync, and c) whether what this tests is also covered during our pull-request tests.

rogthefrog commented 3 months ago

The initial goal of this test is to make sure the published PyPI package still works (as opposed to the local version). Remove the hardcoded package version and use the latest release.
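
As a rough illustration of "use latest", here is a minimal sketch of building the install step without a pinned version. The helper names `pip_install_command` and `install_published` are hypothetical and not part of modelgauge or its workflow. The sketch assumes that an unpinned `pip install` of the package name resolves to the newest published release.

```python
import subprocess
import sys


def pip_install_command(package, version=None):
    """Build a pip install command (hypothetical helper).

    With no version given, the requirement is just the package name,
    so pip resolves the latest published release instead of a pin.
    """
    requirement = f"{package}=={version}" if version else package
    return [sys.executable, "-m", "pip", "install", "--upgrade", requirement]


def install_published(package, version=None):
    """Install from the package index rather than the local checkout."""
    subprocess.run(pip_install_command(package, version), check=True)
```

Calling `install_published("modelgauge")` in a clean environment would then exercise whatever release is currently published, so the test does not need a version bump after each release.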

rogthefrog commented 3 months ago

OK I think this is good to go @wpietri