aws / aws-k8s-tester

AWS Kubernetes tester, kubetest2 deployer implementation
Apache License 2.0

Add single node Neuron test to the e2e tester #450

Closed: weicongw closed 4 months ago

weicongw commented 4 months ago

Issue #, if available:

Description of changes: This PR adds single-node Neuron tests to the e2e2 tester. The tests act as unit tests for the Neuron device.

The test scripts are replicated from https://github.com/aws/deep-learning-containers/blob/master/test/dlc_tests/container_tests/bin/pytorch_tests

Testing

 go test -v . -args -neuronSingleNodeTestImage public.ecr.aws/o5d5x8n6/weicongw:latest
=== RUN   TestMPIJobPytorchTraining
=== RUN   TestMPIJobPytorchTraining/single-node
=== RUN   TestMPIJobPytorchTraining/single-node/Single_node_test_Job_succeeds
--- PASS: TestMPIJobPytorchTraining (110.44s)
    --- PASS: TestMPIJobPytorchTraining/single-node (110.44s)
        --- PASS: TestMPIJobPytorchTraining/single-node/Single_node_test_Job_succeeds (110.08s)
PASS
ok      github.com/aws/aws-k8s-tester/e2e2/test/cases/neuron    117.961s
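
For readers unfamiliar with passing custom flags after `-args`, here is a minimal sketch of how a flag such as `-neuronSingleNodeTestImage` could be declared and validated in Go. It is not the code from this PR: the helper name `parseImageFlag`, the private `FlagSet`, and the policy of treating an empty value as an error are all assumptions made for illustration. Only the flag name and the example image come from the command above.

```go
package main

import (
	"errors"
	"flag"
	"fmt"
)

// parseImageFlag is a hypothetical helper (not taken from the PR). It parses
// args with its own FlagSet and returns the value of -neuronSingleNodeTestImage.
// Rejecting an empty value is an assumed policy, chosen for this sketch.
func parseImageFlag(args []string) (string, error) {
	fs := flag.NewFlagSet("neuron", flag.ContinueOnError)
	image := fs.String("neuronSingleNodeTestImage", "", "container image for the single-node Neuron test")
	if err := fs.Parse(args); err != nil {
		return "", err
	}
	if *image == "" {
		return "", errors.New("-neuronSingleNodeTestImage is required")
	}
	return *image, nil
}

func main() {
	// Arguments mirror the ones passed after -args in the command above.
	args := []string{"-neuronSingleNodeTestImage", "public.ecr.aws/o5d5x8n6/weicongw:latest"}
	image, err := parseImageFlag(args)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println("using image:", image)
}
```

In a real test package the flag would more likely be registered on the default flag set so that `go test` parses it along with its own flags. The separate `FlagSet` here only keeps the sketch self-contained and easy to exercise.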

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.

cartermckinnon commented 4 months ago

also please add a CI job that tests the image build for the new dockerfile 👍