aws-samples / amazon-sagemaker-local-mode

Amazon SageMaker Local Mode Examples
MIT No Attribution

Does not work with 2.24.1 - Processing examples #7

Closed ClaudioLucioLopes closed 2 years ago

ClaudioLucioLopes commented 3 years ago

When I run the container in local Docker, I get an error about the entry point. Which version do the examples work with?

eitansela commented 3 years ago

Hello,

I tried the TensorFlow and PyTorch examples with SageMaker SDK v2.24.1 and they worked. Which example are you trying to run? Are you using a Windows or Mac computer?

ClaudioLucioLopes commented 3 years ago

Hi, I tried the processing examples. I am working on Ubuntu Linux 18.04; I will also try on Windows. Could you check whether the processing examples run with this version?

ClaudioLucioLopes commented 3 years ago

I've tried the scikit-learn example as well. Is there any reason it would not work with scikit-learn?

eitansela commented 3 years ago

This will work only on Mac/Linux, as described in the overview part of this repo.

Can you attach the full stack of errors you got?

ClaudioLucioLopes commented 3 years ago

(localsm) claudiolucio@claudiolucio-G3-3579:~/Downloads/amazon-sagemaker-local-mode-main/scikit_learn_local_processing$ python3 SKLearnProcessor_local_processing.py
Starting processing job.
Note: if launching for the first time in local mode, container image download might take a few minutes to complete.

Job Name:  sagemaker-scikit-learn-2021-02-03-13-37-50-747
Inputs:  [{'InputName': 'input-1', 'AppManaged': False, 'S3Input': {'S3Uri': 's3://sagemaker-us-east-1-168750349473/sagemaker-scikit-learn-2021-02-03-13-37-50-747/input/input-1', 'LocalPath': '/opt/ml/processing/input_data/', 'S3DataType': 'S3Prefix', 'S3InputMode': 'File', 'S3DataDistributionType': 'FullyReplicated', 'S3CompressionType': 'None'}}, {'InputName': 'code', 'AppManaged': False, 'S3Input': {'S3Uri': 's3://sagemaker-us-east-1-168750349473/sagemaker-scikit-learn-2021-02-03-13-37-50-747/input/code/processing_script.py', 'LocalPath': '/opt/ml/processing/input/code', 'S3DataType': 'S3Prefix', 'S3InputMode': 'File', 'S3DataDistributionType': 'FullyReplicated', 'S3CompressionType': 'None'}}]
Outputs:  [{'OutputName': 'word_count_data', 'AppManaged': False, 'S3Output': {'S3Uri': 's3://sagemaker-us-east-1-168750349473/sagemaker-scikit-learn-2021-02-03-13-37-50-747/output/word_count_data', 'LocalPath': '/opt/ml/processing/processed_data/', 'S3UploadMode': 'EndOfJob'}}]
Building with native build. Learn about native build in Compose here: https://docs.docker.com/go/compose-native-build/
Creating m948e7d88d-algo-1-feydm ...
Creating m948e7d88d-algo-1-feydm ... done
Attaching to m948e7d88d-algo-1-feydm
m948e7d88d-algo-1-feydm | python3: can't open file '/opt/ml/processing/input/code/processing_script.py': [Errno 2] No such file or directory
m948e7d88d-algo-1-feydm exited with code 2
2
Aborting on container exit...
Traceback (most recent call last):
  File "/home/claudiolucio/anaconda3/envs/localsm/lib/python3.7/site-packages/sagemaker/local/image.py", line 157, in process
    _stream_output(process)
  File "/home/claudiolucio/anaconda3/envs/localsm/lib/python3.7/site-packages/sagemaker/local/image.py", line 887, in _stream_output
    raise RuntimeError("Process exited with code: %s" % exit_code)
RuntimeError: Process exited with code: 2

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "SKLearnProcessor_local_processing.py", line 67, in <module>
    arguments=['job-type', 'word-count']
  File "/home/claudiolucio/anaconda3/envs/localsm/lib/python3.7/site-packages/sagemaker/processing.py", line 494, in run
    experiment_config=experiment_config,
  File "/home/claudiolucio/anaconda3/envs/localsm/lib/python3.7/site-packages/sagemaker/processing.py", line 695, in start_new
    processor.sagemaker_session.process(process_args)
  File "/home/claudiolucio/anaconda3/envs/localsm/lib/python3.7/site-packages/sagemaker/session.py", line 856, in process
    self.sagemaker_client.create_processing_job(process_request)
  File "/home/claudiolucio/anaconda3/envs/localsm/lib/python3.7/site-packages/sagemaker/local/local_session.py", line 125, in create_processing_job
    ProcessingInputs, ProcessingOutputConfig, Environment, ProcessingJobName
  File "/home/claudiolucio/anaconda3/envs/localsm/lib/python3.7/site-packages/sagemaker/local/entities.py", line 135, in start
    processing_inputs, processing_output_config, environment, processing_job_name
  File "/home/claudiolucio/anaconda3/envs/localsm/lib/python3.7/site-packages/sagemaker/local/image.py", line 162, in process
    raise RuntimeError(msg) from e
RuntimeError: Failed to run: ['docker-compose', '-f', '/tmp/tmp6d72okpw/docker-compose.yaml', 'up', '--build', '--abort-on-container-exit']
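For a failure like this one, it can help to rule out the basics on the host before launching the job. The sketch below is not part of the repository: the helper name `preflight_check` is hypothetical, and it only verifies that the local script file exists and that the `docker-compose` executable named in the failing command is on the PATH. It does not inspect what is actually mounted into the container, so a clean result does not explain the error above.

```python
import shutil
from pathlib import Path


def preflight_check(script_path, executable="docker-compose"):
    """Return a list of human-readable problems; an empty list means
    neither of the two basic checks failed.

    Hypothetical helper, not part of the SageMaker SDK or this repo.
    """
    problems = []
    # The processing script must exist locally before it can be uploaded.
    if not Path(script_path).is_file():
        problems.append(f"script not found: {script_path}")
    # Local mode shells out to docker-compose, as the failing command shows.
    if shutil.which(executable) is None:
        problems.append(f"executable not on PATH: {executable}")
    return problems


if __name__ == "__main__":
    # Run from the example directory; the script name is taken from the log.
    for problem in preflight_check("processing_script.py"):
        print(problem)
```

If both checks pass, the next thing to compare is the environment itself, for example the installed package versions.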

eitansela commented 3 years ago

I tried this example on a fresh Ubuntu 18.04 EC2 instance and it works.

ubuntu@ip-XXX-XX-XX-XX:~/amazon-sagemaker-local-mode/scikit_learn_local_processing$ pip3 freeze|grep sagemaker
sagemaker==2.24.1

ubuntu@ip-XXX-XX-XX-XX:~/amazon-sagemaker-local-mode/scikit_learn_local_processing$ python3 SKLearnProcessor_local_processing.py 
Starting processing job.
Note: if launching for the first time in local mode, container image download might take a few minutes to complete.

Job Name:  sagemaker-scikit-learn-2021-02-03-16-23-17-668
Inputs:  [{'InputName': 'input-1', 'AppManaged': False, 'S3Input': {'S3Uri': 's3://sagemaker-us-east-1-XXXXXXXXXXXX/sagemaker-scikit-learn-2021-02-03-16-23-17-668/input/input-1', 'LocalPath': '/opt/ml/processing/input_data/', 'S3DataType': 'S3Prefix', 'S3InputMode': 'File', 'S3DataDistributionType': 'FullyReplicated', 'S3CompressionType': 'None'}}, {'InputName': 'code', 'AppManaged': False, 'S3Input': {'S3Uri': 's3://sagemaker-us-east-1-XXXXXXXXXXXX/sagemaker-scikit-learn-2021-02-03-16-23-17-668/input/code/processing_script.py', 'LocalPath': '/opt/ml/processing/input/code', 'S3DataType': 'S3Prefix', 'S3InputMode': 'File', 'S3DataDistributionType': 'FullyReplicated', 'S3CompressionType': 'None'}}]
Outputs:  [{'OutputName': 'word_count_data', 'AppManaged': False, 'S3Output': {'S3Uri': 's3://sagemaker-us-east-1-XXXXXXXXXXXX/sagemaker-scikit-learn-2021-02-03-16-23-17-668/output/word_count_data', 'LocalPath': '/opt/ml/processing/processed_data/', 'S3UploadMode': 'EndOfJob'}}]
Building with native build. Learn about native build in Compose here: https://docs.docker.com/go/compose-native-build/
Creating yygjgoue9s-algo-1-j743w ... 
Creating yygjgoue9s-algo-1-j743w ... done
Attaching to yygjgoue9s-algo-1-j743w
yygjgoue9s-algo-1-j743w | Processing Started
yygjgoue9s-algo-1-j743w | Received arguments {'job-type': 'word-count'}
yygjgoue9s-algo-1-j743w | Reading input data from /opt/ml/processing/input_data/
yygjgoue9s-algo-1-j743w | Got Args: {'job-type': 'word-count'}
yygjgoue9s-algo-1-j743w | Available input text files: ['sample_file_2.txt', 'sample_file_3.txt', 'sample_file_1.txt']
yygjgoue9s-algo-1-j743w | Word Count Job Type Started
yygjgoue9s-algo-1-j743w | Detected 6 words in sample_file_2.txt file
yygjgoue9s-algo-1-j743w | Detected 6 words in sample_file_3.txt file
yygjgoue9s-algo-1-j743w | Detected 6 words in sample_file_1.txt file
yygjgoue9s-algo-1-j743w | Total words in 3 files detected: 18
yygjgoue9s-algo-1-j743w | Writing output file: /opt/ml/processing/processed_data/total_words_03022021_16_23_19.txt
yygjgoue9s-algo-1-j743w | Available output text files: ['total_words_03022021_16_23_19.txt']
yygjgoue9s-algo-1-j743w | Processing Complete
yygjgoue9s-algo-1-j743w exited with code 0
Aborting on container exit...
===== Job Complete =====
.{'Outputs': [{'OutputName': 'word_count_data', 'AppManaged': False, 'S3Output': {'S3Uri': 's3://sagemaker-us-east-1-XXXXXXXXXXXX/sagemaker-scikit-learn-2021-02-03-16-23-17-668/output/word_count_data', 'LocalPath': '/opt/ml/processing/processed_data/', 'S3UploadMode': 'EndOfJob'}}]}
Output file is located on: s3://sagemaker-us-east-1-XXXXXXXXXXXX/sagemaker-scikit-learn-2021-02-03-16-23-17-668/output/word_count_data

Did you install the SageMaker local-mode extras? pip install 'sagemaker[local]'

ClaudioLucioLopes commented 3 years ago

Yes, I installed everything. Could you tell me which versions you are using? When I install from dev, the error changes.
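To compare two environments, both sides can print the same version report. The following is a minimal sketch, assuming Python 3.8 or later for `importlib.metadata` (on the Python 3.7 environment shown in the traceback, `pip3 freeze` gives the same information). The function name `report_versions` is hypothetical, and every package name in the default list other than `sagemaker` is a guess at what might be relevant, not something confirmed in this thread.

```python
import platform
import shutil
from importlib import metadata


def report_versions(packages=("sagemaker", "boto3", "docker")):
    """Map each name to its installed version, or None if not installed.

    Hypothetical helper; the default package list is an assumption.
    """
    report = {"python": platform.python_version()}
    for name in packages:
        try:
            report[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            report[name] = None
    # Record where docker-compose resolves from, or None if it is not on PATH.
    report["docker-compose path"] = shutil.which("docker-compose")
    return report


if __name__ == "__main__":
    for key, value in report_versions().items():
        print(f"{key}: {value}")
```

Posting this output from both the failing and the working machine makes differences easy to spot.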

eitansela commented 3 years ago

Hello @ClaudioLucioLopes, I didn't understand what you did when installing from dev. Can you please explain?