awslabs / benchmark-ai

Anubis (formerly known as Benchmark AI) measures the goodness of machine learning workloads
Apache License 2.0

Mpijob migration #1041

Closed tejaschumbalkar closed 3 years ago

tejaschumbalkar commented 3 years ago

Description of changes: Migrate the MPIJob template to the v1alpha2 API version supported by the MPI operator.
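For readers unfamiliar with the v1alpha2 layout, here is a rough sketch of what such a manifest looks like, built as a plain Python dict. It is not the template changed in this PR. The helper name `mpijob_v1alpha2`, the image, and the command are hypothetical, and the field names (`slotsPerWorker`, `mpiReplicaSpecs`, `Launcher`, `Worker`) are written from the MPI operator's v1alpha2 examples as best recalled, so check them against the operator's CRD before relying on them.

```python
import json


def mpijob_v1alpha2(name, image, command, workers=2, slots_per_worker=1):
    """Build a minimal MPIJob manifest in the v1alpha2 layout (illustrative only).

    Assumption: v1alpha2 describes the launcher and the workers as separate
    replica specs under spec.mpiReplicaSpecs.
    """
    # The launcher runs the MPI command; the workers only need the image.
    launcher_container = {"name": name, "image": image, "command": list(command)}
    worker_container = {"name": name, "image": image}
    return {
        "apiVersion": "kubeflow.org/v1alpha2",
        "kind": "MPIJob",
        "metadata": {"name": name},
        "spec": {
            "slotsPerWorker": slots_per_worker,
            "mpiReplicaSpecs": {
                "Launcher": {
                    "replicas": 1,
                    "template": {"spec": {"containers": [launcher_container]}},
                },
                "Worker": {
                    "replicas": workers,
                    "template": {"spec": {"containers": [worker_container]}},
                },
            },
        },
    }


if __name__ == "__main__":
    # Hypothetical values, for illustration only.
    job = mpijob_v1alpha2("demo", "example/image:latest", ["mpirun", "python", "train.py"])
    print(json.dumps(job, indent=2))
```

The dict can be dumped to JSON (as above) or YAML and compared against the migrated template.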

By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.

stsukrov commented 3 years ago

Please clarify the broken tests as well.

tejaschumbalkar commented 3 years ago

> Please clarify the broken tests as well.

The test failures are due to

  1. SageMaker 2.0 has a breaking change
  2. Python logging TypeError exception #1037

Both failures are fixed in #1042

stsukrov commented 3 years ago

> > Please clarify the broken tests as well.
>
> The test failures are due to
>
>   1. SageMaker 2.0 has a breaking change
>   2. Python logging TypeError exception #1037
>
> Both failures are fixed in #1042

Thanks! It's comforting to see green tests