Added an unconditional skip to test_serving.py::test_serve_csv and test_serving.py::test_serve_cpu_model_on_gpu because they were showing issues, and I've concluded that they are test flaws regarding distinction between cpu and gpu. test_serving.py needs to be reevaluated and fixed/replaced with newer tests/models for torchserve. In the meantime, I removed DataParallel from test/sagemaker_tests/pytorch/inference/resources/mnist/model_cpu/1d/code/mnist_1d.py because 1. DataParallel is no longer recommended by pytorch and 2. it is supposed to be a cpu model. To account for this, I also replaced the model tar file with a new model.pth that is compatible with a network not using DataParallel.
NOTE: By default, docker builds are disabled. In order to build your container, please update dlc_developer_config.toml and specify the framework to build in "build_frameworks"
[x] I have run builds/tests on commit for my changes.
NOTE: If you are creating a PR for a new framework version, please ensure success of the standard, rc, and efa sagemaker remote tests by updating the dlc_developer_config.toml file:
Expand
- [ ] `sagemaker_remote_tests = true`
- [ ] `sagemaker_efa_tests = true`
- [ ] `sagemaker_rc_tests = true`
**Additionally, please run the sagemaker local tests in at least one revision:**
- [ ] `sagemaker_local_tests = true`
Expand
Fill out the template and click the checkbox of the builds you'd like to execute
*Note: Replace with with the major.minor framework version (i.e. 2.2) you would like to start.*
- [ ] build_pytorch_training__sm
- [ ] build_pytorch_training__ec2
- [x] build_pytorch_inference_2.3_sm
- [x] build_pytorch_inference_2.3_ec2
- [ ] build_pytorch_inference__graviton
- [ ] build_tensorflow_training__sm
- [ ] build_tensorflow_training__ec2
- [ ] build_tensorflow_inference__sm
- [ ] build_tensorflow_inference__ec2
- [ ] build_tensorflow_inference__graviton
Additional context
PR Checklist
Expand
- [ ] I've prepended PR tag with frameworks/job this applies to : [mxnet, tensorflow, pytorch] | [ei/neuron/graviton] | [build] | [test] | [benchmark] | [ec2, ecs, eks, sagemaker]
- [ ] If the PR changes affects SM test, I've modified dlc_developer_config.toml in my PR branch by setting sagemaker_tests = true and efa_tests = true
- [ ] If this PR changes existing code, the change fully backward compatible with pre-existing code. (Non backward-compatible changes need special approval.)
- [ ] (If applicable) I've documented below the DLC image/dockerfile this relates to
- [ ] (If applicable) I've documented below the tests I've run on the DLC image
- [ ] (If applicable) I've reviewed the licenses of updated and new binaries and their dependencies to make sure all licenses are on the Apache Software Foundation Third Party License Policy Category A or Category B license list. See [https://www.apache.org/legal/resolved.html](https://www.apache.org/legal/resolved.html).
- [ ] (If applicable) I've scanned the updated and new binaries to make sure they do not have vulnerabilities associated with them.
#### NEURON/GRAVITON Testing Checklist
* When creating a PR:
- [ ] I've modified `dlc_developer_config.toml` in my PR branch by setting `neuron_mode = true` or `graviton_mode = true`
#### Benchmark Testing Checklist
* When creating a PR:
- [ ] I've modified `dlc_developer_config.toml` in my PR branch by setting `ec2_benchmark_tests = true` or `sagemaker_benchmark_tests = true`
Pytest Marker Checklist
Expand
- [ ] (If applicable) I have added the marker `@pytest.mark.model("")` to the new tests which I have added, to specify the Deep Learning model that is used in the test (use `"N/A"` if the test doesn't use a model)
- [ ] (If applicable) I have added the marker `@pytest.mark.integration("")` to the new tests which I have added, to specify the feature that will be tested
- [ ] (If applicable) I have added the marker `@pytest.mark.multinode()` to the new tests which I have added, to specify the number of nodes used on a multi-node test
- [ ] (If applicable) I have added the marker `@pytest.mark.processor(<"cpu"/"gpu"/"eia"/"neuron">)` to the new tests which I have added, if a test is specifically applicable to only one processor type
By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license. I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.
Description
test_serving.py::test_serve_csv
andtest_serving.py::test_serve_cpu_model_on_gpu
because they were showing issues, and I've concluded that they are test flaws regarding distinction between cpu and gpu. test_serving.py needs to be reevaluated and fixed/replaced with newer tests/models for torchserve. In the meantime, I removed DataParallel fromtest/sagemaker_tests/pytorch/inference/resources/mnist/model_cpu/1d/code/mnist_1d.py
because 1. DataParallel is no longer recommended by pytorch and 2. it is supposed to be a cpu model. To account for this, I also replaced the model tar file with a new model.pth that is compatible with a network not using DataParallel.Tests run
NOTE: By default, docker builds are disabled. In order to build your container, please update dlc_developer_config.toml and specify the framework to build in "build_frameworks"
NOTE: If you are creating a PR for a new framework version, please ensure success of the standard, rc, and efa sagemaker remote tests by updating the dlc_developer_config.toml file:
Expand
- [ ] `sagemaker_remote_tests = true` - [ ] `sagemaker_efa_tests = true` - [ ] `sagemaker_rc_tests = true` **Additionally, please run the sagemaker local tests in at least one revision:** - [ ] `sagemaker_local_tests = true`Formatting
black -l 100
on my code (formatting tool: https://black.readthedocs.io/en/stable/getting_started.html)DLC image/dockerfile
Builds to Execute
Expand
Fill out the template and click the checkbox of the builds you'd like to execute *Note: Replace withAdditional context
PR Checklist
Expand
- [ ] I've prepended PR tag with frameworks/job this applies to : [mxnet, tensorflow, pytorch] | [ei/neuron/graviton] | [build] | [test] | [benchmark] | [ec2, ecs, eks, sagemaker] - [ ] If the PR changes affects SM test, I've modified dlc_developer_config.toml in my PR branch by setting sagemaker_tests = true and efa_tests = true - [ ] If this PR changes existing code, the change fully backward compatible with pre-existing code. (Non backward-compatible changes need special approval.) - [ ] (If applicable) I've documented below the DLC image/dockerfile this relates to - [ ] (If applicable) I've documented below the tests I've run on the DLC image - [ ] (If applicable) I've reviewed the licenses of updated and new binaries and their dependencies to make sure all licenses are on the Apache Software Foundation Third Party License Policy Category A or Category B license list. See [https://www.apache.org/legal/resolved.html](https://www.apache.org/legal/resolved.html). - [ ] (If applicable) I've scanned the updated and new binaries to make sure they do not have vulnerabilities associated with them. #### NEURON/GRAVITON Testing Checklist * When creating a PR: - [ ] I've modified `dlc_developer_config.toml` in my PR branch by setting `neuron_mode = true` or `graviton_mode = true` #### Benchmark Testing Checklist * When creating a PR: - [ ] I've modified `dlc_developer_config.toml` in my PR branch by setting `ec2_benchmark_tests = true` or `sagemaker_benchmark_tests = true`Pytest Marker Checklist
Expand
- [ ] (If applicable) I have added the marker `@pytest.mark.model("By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license. I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.