Closed algoscale1 closed 5 years ago
Hi @algoscale1 ,
Thanks for using sagemaker!
For MXNet training, the MXNet estimator accepts a source_dir argument that can be used for additional dependencies.
README doc: source_dir Path (absolute or relative) to a directory with any other training source code dependencies aside from the entry point file. Structure within this directory will be preserved when training on SageMaker.
Source code: https://github.com/aws/sagemaker-python-sdk/blob/master/src/sagemaker/mxnet/estimator.py#L24
Thanks!
Hi @yangaws
I tried to provide the source_dir path to the site-packages directory but it didn't work. Is there any example that installs additional dependencies in mxnet?
Hi @algoscale1 ,
I am really sorry. I just checked the code and found that this feature for importing external libraries is only available in some of our frameworks. Unfortunately, MXNet is not one of them.
So what we recommend is installing the dependencies from within your code:
https://stackoverflow.com/questions/12332975/installing-python-module-within-code
Sorry again for giving an inaccurate answer at the beginning.
BTW, enabling this feature for MXNet is a task in our backlog. In the future, there will be a better way than pip-in-code to import additional dependencies.
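For reference, a minimal sketch of the pip-in-code approach described above, assuming the training container has pip available and network access to a package index. The helper names build_pip_command and install are hypothetical, not part of the SageMaker SDK.

```python
import subprocess
import sys


def build_pip_command(package, upgrade=False):
    """Build a pip command that targets the interpreter running this script."""
    cmd = [sys.executable, "-m", "pip", "install"]
    if upgrade:
        cmd.append("--upgrade")
    cmd.append(package)
    return cmd


def install(package, upgrade=False):
    """Run pip in a subprocess; raises CalledProcessError if the install fails."""
    subprocess.check_call(build_pip_command(package, upgrade))


# Assumed usage at the top of the entry point script, before the import:
#   install("pandas")
#   import pandas
```

Calling pip through sys.executable keeps the install tied to the same Python environment that runs the training script, which is usually what you want inside a container.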
My question is related:
I don't think pip-in-code can upgrade existing modules? In particular, I want to upgrade mxnet to the latest (pre-release) version. While I can install newer versions, the import always defaults to 1.1.0.
Any workarounds? Would it make sense to somehow include mxnet in source_dir?
Solved my own problem after reading through these lines:
# For building images of MXNet versions 1.1 and above
docker build -t preprod-mxnet:1.1.0-cpu-py2 --build-arg py_version=2 \
    --build-arg framework_installable=mxnet-1.1.0-py2.py3-none-manylinux1_x86_64.whl -f Dockerfile.cpu .
Closing due to inactivity. Feel free to reopen if necessary.
@yangaws Hi yang, can you be more specific? If I use the sklearn estimator, there are no requirements or env parameters for specifying external package names... Detailed problem: sklearn_estimator = SKLearn(entry_point='text.py', *args) ... in text.py, I need to import an external package that hasn't been installed, for example "NLTK"?
Hi @Seninus, if you want to use NLTK in your text.py script in SageMaker, you can install NLTK yourself in the script.
For how to do that you can refer to this: https://stackoverflow.com/questions/12332975/installing-python-module-within-code
But I have not followed updates in this repo for some time, so I am not sure whether what I said is still recommended by SageMaker. I suggest you reopen this issue to confirm.
@Seninus You can include a requirements.txt file in your source directory. For more, see https://sagemaker.readthedocs.io/en/stable/using_sklearn.html#using-third-party-libraries
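As a rough illustration of that layout, here is a sketch that writes an entry point and a requirements.txt side by side in one directory. The directory name src and the helpers make_source_dir and estimator_kwargs are hypothetical; the remaining estimator arguments (role, instance settings, and so on) depend on your SDK version and are left out.

```python
import tempfile
from pathlib import Path


def make_source_dir(packages, root=None):
    """Create a source directory holding text.py and requirements.txt."""
    src = Path(root or tempfile.mkdtemp()) / "src"
    src.mkdir(parents=True, exist_ok=True)
    # Entry point script; it imports the third-party package as usual.
    (src / "text.py").write_text("import nltk\n")
    # One requirement per line, the format pip expects.
    (src / "requirements.txt").write_text("\n".join(packages) + "\n")
    return src


def estimator_kwargs(src):
    """Only the two arguments that point an estimator at the directory."""
    return {"entry_point": "text.py", "source_dir": str(src)}


# Assumed usage (other required SKLearn arguments omitted):
#   src = make_source_dir(["nltk"])
#   kwargs = estimator_kwargs(src)
```

The dictionary would be merged with the other arguments you already pass to SKLearn; see the linked documentation for which framework versions pick up requirements.txt.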
Thanks @laurenyu, I did figure out how to include the requirements.txt in source_dir... but pip install failed with an SSL certificate error. The training instance is in my VPC, and my company network might be blocking this external pip install. I am still trying to figure it out... let me know if my direction is wrong.
Hi,
I am using MXNet to deploy an XGBoost model on SageMaker. I have created a script that contains all the required training and inference functions, like train(), input_fn(), etc.
mnist_estimator = MXNet(entry_point='mnist.py', role=role, output_path=model_artifacts_location, code_location=custom_code_upload_location, train_instance_count=1, train_instance_type='ml.m4.xlarge', py_version='py3')
I am trying to use the pandas library in the train function of the MXNet script, but I am getting this error
If I am not wrong, MXNet creates its own environment where these external libraries are not present. Is there any way I can use these external libraries?