I am using the main branch. The infrastructure has been set up using Terraform,
and the Docker image has been built and pushed to ECR.
When I run the ml-pipeline-terraform-demo-state-machine, it fails at the step Create Training Job:
"FailureReason": "AlgorithmError: , exit code: 1",
I suspect it has something to do with the permissions on the Python files in src. Even after changing the permissions, updating the Dockerfile, and rebuilding the Docker image, I am still unable to call the Python files or ls the contents of /opt/program.
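To narrow down the permissions theory before rebuilding again, a small local check like the one below might help. It is only a sketch: it assumes the Dockerfile copies the contents of src into /opt/program and that the file modes from the build context carry over into the image. The function name `find_non_executable` is hypothetical and not part of the repository.

```python
import os
import stat


def find_non_executable(root):
    """Return the relative paths of regular files under `root` that lack
    the owner-executable bit (hypothetical helper, not from the repo)."""
    missing = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            mode = os.stat(path).st_mode
            # Only regular files matter; skip sockets, FIFOs, and so on.
            if stat.S_ISREG(mode) and not mode & stat.S_IXUSR:
                missing.append(os.path.relpath(path, root))
    return sorted(missing)
```

Calling `find_non_executable("src")` from the repository root (assuming that is where src lives) lists the files that would not be directly executable inside the container. Plain modules that are only imported do not need the bit; the script that the training job invokes as its entrypoint does.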