Closed ChoiByungWook closed 4 years ago
Hi, thanks for the update. So I tried again but I still got the same error! Weird that it seems like it's still referencing the old file.
Traceback (most recent call last):
  File "/usr/local/bin/dockerd-entrypoint.py", line 8, in <module>
    serving.main()
  File "/usr/local/lib/python3.6/site-packages/sagemaker_mxnet_serving_container/serving.py", line 42, in main
    model_server.start_model_server(handler_service=HANDLER_SERVICE)
  File "/usr/local/lib/python3.6/site-packages/sagemaker_inference/model_server.py", line 63, in start_model_server
    '/dev/null'])
  File "/usr/local/lib/python3.6/subprocess.py", line 287, in call
    with Popen(*popenargs, **kwargs) as p:
  File "/usr/local/lib/python3.6/subprocess.py", line 729, in __init__
    restore_signals, start_new_session)
  File "/usr/local/lib/python3.6/subprocess.py", line 1364, in _execute_child
    raise child_exception_type(errno_num, err_msg, err_filename)
OSError: [Errno 14] Bad address: 'tail'
Here is the code that I'm running
import sagemaker
from sagemaker.mxnet.model import MXNetModel
from sagemaker import get_execution_role

sagemaker_session = sagemaker.Session()
bucket = sagemaker_session.default_bucket()
role = get_execution_role()
print(sagemaker.__version__)  # prints version 1.43.4.post1

batch_input = 's3://{}/test_images'.format(bucket)
batch_output = 's3://{}/combined_results'.format(bucket)

sagemaker_model = MXNetModel(
    model_data='s3://' + sagemaker_session.default_bucket() + '/model/yolo_object_person_detector.tar.gz',
    role=role,
    entry_point='combined_entry_point.py',
    dependencies=['requirements.txt'],
    py_version='py3',
    framework_version='1.4.1',
    sagemaker_session=sagemaker_session,
)

transformer = sagemaker_model.transformer(instance_count=1, instance_type='ml.m4.xlarge', output_path=batch_output)
transformer.transform(data=batch_input, content_type='application/x-image')
Am I missing some steps here? As far as I know, the changes are all picked up automatically, right?
@laurenyu @ChoiByungWook
Or perhaps the Docker image that sagemaker-python-sdk is pulling from has not been updated with the new code changes yet?
Issue #, if available: User running into memory issues due to tail. For more information, see: https://github.com/aws/sagemaker-inference-toolkit/issues/9
Description of changes:
User ran into an error due to the command "tail -f /dev/null".
The tail call is meant to keep the container running. Instead, I now wait on the server process to finish or return an error code. The process that is responsible for starting the mxnet-model-server can't be used for this wait, because MMS starts another subprocess, which for some reason can't be tracked by calling children() on the mms_process. The parent of that child process turns out to be bash. For this reason, we look the server process up by the command line it was launched with, which comes from here: https://github.com/awslabs/mxnet-model-server/blob/master/mms/model_server.py#L56
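To illustrate the idea, here is a minimal, standard-library-only sketch of finding a process by its command line and then blocking until it exits. It is not the code in this PR: the function names, the process snapshot, and the marker string "example.ModelServer" are all hypothetical, and the liveness check assumes a POSIX system.

```python
import os
import time


def find_pids_by_cmdline(processes, marker):
    """Return PIDs whose command line contains `marker`.

    `processes` is any iterable of (pid, argv) pairs, where argv is a list
    of strings. In a real container this snapshot could come from a process
    library or from /proc; it is passed in here so the lookup stays testable.
    """
    return [pid for pid, argv in processes if any(marker in arg for arg in argv)]


def pid_exists(pid):
    """Best-effort POSIX liveness check: signal 0 probes without killing."""
    try:
        os.kill(pid, 0)
    except ProcessLookupError:
        return False
    except PermissionError:
        return True  # the process exists but belongs to another user
    return True


def wait_until_gone(pid, is_alive=pid_exists, poll_interval=1.0):
    """Block until `is_alive(pid)` turns false; return the number of polls."""
    polls = 0
    while is_alive(pid):
        polls += 1
        time.sleep(poll_interval)
    return polls


# Hypothetical snapshot: the bash wrapper plus the server it launched.
snapshot = [
    (10, ["bash", "-c", "start-server"]),
    (11, ["java", "-cp", "server.jar", "example.ModelServer"]),
]
server_pids = find_pids_by_cmdline(snapshot, "example.ModelServer")
print(server_pids)  # [11]
```

Polling is used because the server is not a direct child of the launching process, so it cannot be reaped with a plain wait call. Note that the signal-0 check also reports an exited but not yet reaped (zombie) process as alive, so a production version would likely need a stricter check.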
Testing
MXNet serving
I modified the MXNet 1.4.1 CPU Dockerfile to install the modified version of this package and ran the local and SageMaker integration tests.
local
sagemaker
PyTorch
By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.