Hello,
I was training a DistilBERT model on a SageMaker instance using fast-bert, with an ml.p2.xlarge instance for GPU processing.
When fit() downloads the training image from ECR, I receive "/usr/bin/env: ‘python\r’: No such file or directory".
At the end of the stack trace I received the following: "error for training job failed. reason: algorithmerror: exit code: 127".
Tech stack:
- fast-bert Docker image
- SageMaker notebook instance: ml.t2.medium
- GPU compute: ml.p2.xlarge
What could be the reason for this error? My IAM role has all the required permissions.
Kindly help.
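
One observation that may help narrow this down: the `\r` after `python` in the message usually means the shebang line of an executable script ends with Windows (CRLF) line endings, and exit code 127 is the shell's "command not found" status. I have not confirmed that this is the cause here. Below is a minimal sketch for checking a script and converting it to LF before rebuilding the image. The file name `train` in the usage note is a hypothetical entry-point script, not something taken from the fast-bert image, and the helper names are my own.

```python
from pathlib import Path


def has_crlf_shebang(path):
    """Return True if the file's first line is a shebang that ends in CRLF."""
    with open(path, "rb") as f:
        first_line = f.readline()
    return first_line.startswith(b"#!") and first_line.endswith(b"\r\n")


def convert_to_lf(path):
    """Rewrite the file in place, replacing every CRLF with LF."""
    p = Path(path)
    data = p.read_bytes()
    p.write_bytes(data.replace(b"\r\n", b"\n"))
```

For example, calling `has_crlf_shebang("train")` on the local copy of the script that gets baked into the container would show whether it is affected, and `convert_to_lf("train")` would fix it before the image is rebuilt and pushed to ECR again.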