aws-samples / amazon-sagemaker-local-mode

Amazon SageMaker Local Mode Examples
MIT No Attribution

yolov5 sagemaker deploy #32

Open omaiyiwa opened 1 year ago

omaiyiwa commented 1 year ago

Hello. In amazon-sagemaker-local-mode/pytorch_yolov5_local_model_inference/, the local deploy fails with this error:

ImportError: 'docker-compose' is not installed. Local Mode features will not work without docker-compose. For more information on how to install 'docker-compose', please, see https://docs.docker.com/compose/install/

I tried the recommended solution, but it didn't work (maybe I made a mistake). I ran:

curl -SL https://github.com/docker/compose/releases/download/v2.17.2/docker-compose-linux-x86_64 -o /usr/local/bin/docker-compose

and got this output:

  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
  0     0    0     0    0     0      0      0 --:--:-- --:--:-- --:--:--     0
Warning: Failed to open the file /usr/local/bin/docker-compose: Permission
Warning: denied
  0 51.9M    0  1406    0     0   1780      0  8:29:52 --:--:--  8:29:52  1780
curl: (23) Failure writing output to destination

eitansela commented 1 year ago

It looks like you have a Permission denied issue with docker-compose. Please try to run it on a simple example: https://docs.docker.com/compose/gettingstarted/
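The curl output above shows a write failure on /usr/local/bin/docker-compose, so the download itself never completed. A minimal Python sketch, assuming a per-user directory such as ~/.local/bin is an acceptable fallback (the helper name and the candidate list are illustrative and not part of the sample repo), that picks the first directory the current user can actually write to:

```python
import os
from pathlib import Path
from typing import Iterable, Optional


def first_writable_dir(candidates: Iterable[str]) -> Optional[Path]:
    """Return the first existing directory the current user can write to."""
    for candidate in candidates:
        path = Path(candidate).expanduser()
        # W_OK: may create files there; X_OK: may traverse the directory.
        if path.is_dir() and os.access(path, os.W_OK | os.X_OK):
            return path
    return None


if __name__ == "__main__":
    # Hypothetical candidates: the system path from the thread, then a per-user fallback.
    target = first_writable_dir(["/usr/local/bin", "~/.local/bin"])
    if target is None:
        print("No writable install directory found; create one or use elevated permissions.")
    else:
        print(f"Download docker-compose to {target / 'docker-compose'} and mark it executable.")
```

If a per-user directory is chosen, it also has to be on PATH for the SageMaker SDK to find docker-compose.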

omaiyiwa commented 1 year ago

Thanks. I keep getting an error at this line: predictions = predictor.predict("https://ultralytics.com/images/zidane.jpg")

The error is: botocore.errorfactory.ModelError: An error occurred (ModelError) when calling the InvokeEndpoint operation: Received server error (0) from primary with message "Your invocation timed out while waiting for a response from container primary. Review the latency metrics for each container in Amazon CloudWatch, resolve the issue, and try again."

omaiyiwa commented 1 year ago

My deploy call is: predictor = model.deploy(initial_instance_count=1, instance_type='ml.m5.xlarge', endpoint_name=endpoint_name)
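The call above passes instance_type='ml.m5.xlarge', which requests a hosted endpoint rather than a local container. A small sketch of a guard that could run before model.deploy, assuming the SageMaker Python SDK's local-mode values 'local' and 'local_gpu' (the helper name is hypothetical):

```python
# Instance type strings that select SageMaker local mode instead of a hosted instance.
LOCAL_INSTANCE_TYPES = {"local", "local_gpu"}


def is_local_mode(instance_type: str) -> bool:
    """Return True when the given instance type runs the container locally."""
    return instance_type in LOCAL_INSTANCE_TYPES


if __name__ == "__main__":
    for candidate in ("ml.m5.xlarge", "local"):
        mode = "local Docker container" if is_local_mode(candidate) else "hosted endpoint"
        print(f"{candidate}: {mode}")
```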

omaiyiwa commented 1 year ago
  1. Sorry, that was my mistake: I had specified an instance type instead of local mode.
  2. I installed docker-compose and gave it executable permissions.
  3. docker-compose version now reports: Docker Compose version v2.17.2.
  4. However, the following error occurs during deployment:

  File "/home/sagemaker-user/pytorch_yolov5_local_model_inference.py", line 63, in <module>
    main()
  File "/home/sagemaker-user/pytorch_yolov5_local_model_inference.py", line 48, in main
    predictor = model.deploy(initial_instance_count=1, instance_type='local', endpoint_name=endpoint_name)
  File "/opt/conda/envs/studio/lib/python3.9/site-packages/sagemaker/model.py", line 1298, in deploy
    self.sagemaker_session.endpoint_from_production_variants(
  File "/opt/conda/envs/studio/lib/python3.9/site-packages/sagemaker/session.py", line 4539, in endpoint_from_production_variants
    return self.create_endpoint(endpoint_name=name, config_name=name, tags=tags, wait=wait)
  File "/opt/conda/envs/studio/lib/python3.9/site-packages/sagemaker/session.py", line 3899, in create_endpoint
    self.sagemaker_client.create_endpoint(
  File "/opt/conda/envs/studio/lib/python3.9/site-packages/sagemaker/local/local_session.py", line 359, in create_endpoint
    endpoint.serve()
  File "/opt/conda/envs/studio/lib/python3.9/site-packages/sagemaker/local/entities.py", line 603, in serve
    self.container.serve(
  File "/opt/conda/envs/studio/lib/python3.9/site-packages/sagemaker/local/image.py", line 296, in serve
    if _ecr_login_if_needed(self.sagemaker_session.boto_session, self.image):
  File "/opt/conda/envs/studio/lib/python3.9/site-packages/sagemaker/local/image.py", line 1086, in _ecr_login_if_needed
    if _check_output("docker images -q %s" % image).strip():
  File "/opt/conda/envs/studio/lib/python3.9/site-packages/sagemaker/local/image.py", line 934, in _check_output
    output = subprocess.check_output(cmd, *popenargs, **kwargs)
  File "/opt/conda/envs/studio/lib/python3.9/subprocess.py", line 424, in check_output
    return run(*popenargs, stdout=PIPE, timeout=timeout, check=True,
  File "/opt/conda/envs/studio/lib/python3.9/subprocess.py", line 505, in run
    with Popen(*popenargs, **kwargs) as process:
  File "/opt/conda/envs/studio/lib/python3.9/subprocess.py", line 951, in __init__
    self._execute_child(args, executable, preexec_fn, close_fds,
  File "/opt/conda/envs/studio/lib/python3.9/subprocess.py", line 1821, in _execute_child
    raise child_exception_type(errno_num, err_msg, err_filename)
NotADirectoryError: [Errno 20] Not a directory: 'docker'

eitansela commented 1 year ago

This is an issue with the Docker installation. Try running a simple docker-compose command with a sample Docker image before running the yolov5 code sample.
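A preflight check along these lines can confirm whether the docker and docker-compose executables are reachable from the same Python environment before deploying. This is a sketch using only the standard library; the function names are illustrative. It is an assumption, not something confirmed in this thread, that the NotADirectoryError comes from docker being absent from PATH while some PATH entry points at a file rather than a directory:

```python
import os
import shutil
from typing import Dict, List, Optional


def find_tools(names: List[str], path: Optional[str] = None) -> Dict[str, Optional[str]]:
    """Map each executable name to its resolved location on PATH, or None if missing."""
    return {name: shutil.which(name, path=path) for name in names}


def non_directory_path_entries(path: Optional[str] = None) -> List[str]:
    """List PATH entries that exist but are not directories."""
    raw = os.environ.get("PATH", "") if path is None else path
    return [
        entry
        for entry in raw.split(os.pathsep)
        if entry and os.path.exists(entry) and not os.path.isdir(entry)
    ]


if __name__ == "__main__":
    for tool, location in find_tools(["docker", "docker-compose"]).items():
        print(f"{tool}: {location or 'NOT FOUND on PATH'}")
    for entry in non_directory_path_entries():
        print(f"PATH entry is not a directory: {entry}")
```

If docker is reported as not found, local mode cannot start a container regardless of whether docker-compose is installed.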