I can get this example working without any problems, and I can adapt it to my purpose. However, I am now trying to use the AWS SageMaker JavaScript SDK to accomplish the same task. I don't quite understand how source_dir in the Jupyter notebook instance gets transferred to the SageMaker training instance.
Is this done by the Python SageMaker SDK? Can someone comment on how this could be done via the JavaScript SageMaker SDK?
The following successfully launches a training instance. The image is downloaded from ECR; however, the training fails. I suspect this is because source_dir has not been copied over to the SageMaker training instance. SageMaker training fails with the following error:
2021-10-08 18:30:19,173 sagemaker-training-toolkit ERROR framework error:
Traceback (most recent call last):
  File "/usr/local/lib/python3.6/dist-packages/sagemaker_training/trainer.py", line 97, in train
    runner_type=runner_type,
  File "/usr/local/lib/python3.6/dist-packages/sagemaker_training/entry_point.py", line 92, in run
    files.download_and_extract(uri=uri, path=environment.code_dir)
  File "/usr/local/lib/python3.6/dist-packages/sagemaker_training/files.py", line 131, in download_and_extract
    s3_download(uri, dst)
  File "/usr/local/lib/python3.6/dist-packages/sagemaker_training/files.py", line 167, in s3_download
    s3.Bucket(bucket).download_file(key, dst)
  File "/usr/local/lib/python3.6/dist-packages/boto3/s3/inject.py", line 247, in bucket_download_file
    ExtraArgs=ExtraArgs, Callback=Callback, Config=Config)
  File "/usr/local/lib/python3.6/dist-packages/boto3/s3/inject.py", line 173, in download_file
    extra_args=ExtraArgs, callback=Callback)
  File "/usr/local/lib/python3.6/dist-packages/boto3/s3/transfer.py", line 307, in download_file
    future.result()
  File "/usr/local/lib/python3.6/dist-packages/s3transfer/futures.py", line 106, in result
    return self._coordinator.result()
  File "/usr/local/lib/python3.6/dist-packages/s3transfer/futures.py", line 265, in result
    raise self._exception
  File "/usr/local/lib/python3.6/dist-packages/s3transfer/tasks.py", line 255, in _main
    self._submit(transfer_future=transfer_future, **kwargs)
  File "/usr/local/lib/python3.6/dist-packages/s3transfer/download.py", line 343, in _submit
    **transfer_future.meta.call_args.extra_args
  File "/usr/local/lib/python3.6/dist-packages/botocore/client.py", line 386, in _api_call
    return self._make_api_call(operation_name, kwargs)
  File "/usr/local/lib/python3.6/dist-packages/botocore/client.py", line 705, in _make_api_call
    raise error_class(parsed_response, operation_name)
botocore.exceptions.ClientError: An error occurred (404) when calling the HeadObject operation: Not Found
Any insight into what causes this error, or into how to get this same example working with the SageMaker JavaScript SDK, would be greatly appreciated.
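As far as I can tell, the Python SDK handles source_dir by packaging it as a tar.gz archive, uploading it to S3, and passing the location to the container through the reserved hyperparameters sagemaker_submit_directory and sagemaker_program. The traceback is consistent with this: the training toolkit tries to download the code archive from an S3 URI and S3 answers 404, so the URI it was given does not point at an existing object. A minimal sketch of building those hyperparameters by hand in JavaScript might look like the following. The helper names, bucket, key, and region are hypothetical, and JSON-encoding the values mirrors what the Python SDK appears to do, which is an assumption here.

```javascript
// Hypothetical helpers; only the hyperparameter keys follow the
// SageMaker script-mode convention used by the training toolkit.

// Split "s3://bucket/key" into Bucket and Key so the object can be
// checked (or uploaded) before the training job is created.
function parseS3Uri(uri) {
  const match = /^s3:\/\/([^/]+)\/(.+)$/.exec(uri);
  if (!match) {
    throw new Error(`Not a valid S3 object URI: ${uri}`);
  }
  return { Bucket: match[1], Key: match[2] };
}

// Build the string-to-string map that tells the container where the
// packaged source code lives and which script to run.
function buildScriptModeHyperParameters(codeUri, entryPoint, region) {
  parseS3Uri(codeUri); // throws early on a malformed URI
  return {
    // Assumption: values are JSON-encoded strings, as the Python SDK seems to send them.
    sagemaker_submit_directory: JSON.stringify(codeUri),
    sagemaker_program: JSON.stringify(entryPoint),
    sagemaker_region: JSON.stringify(region),
  };
}

// Placeholder bucket, key, and region for illustration only.
const hyperParameters = buildScriptModeHyperParameters(
  "s3://my-example-bucket/my-job/source/sourcedir.tar.gz",
  "train.py",
  "us-east-1"
);
console.log(hyperParameters);
```

The resulting object would go into the HyperParameters field of the CreateTrainingJob request, alongside any hyperparameters the training script itself expects. The archive has to be uploaded to that exact bucket and key before the job starts, with the entry point script at the root of the archive; otherwise the download step fails with the same 404.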