Azure / azureml-sdk-for-r

Azure Machine Learning SDK for R
https://azure.github.io/azureml-sdk-for-r/

pip version cannot be set to 20.1.1 to mitigate ruamel issue as described in trouble shooting guide #408

Closed robert4os closed 3 years ago

robert4os commented 3 years ago

Describe the bug

Hi,

In an attempt to mitigate the 'No module named ruamel' issue, I am trying to pin pip to version 20.1.1.

I am following the troubleshooting guide at: https://azure.github.io/azureml-sdk-for-r/articles/troubleshooting.html#modulenotfounderror-no-module-named-ruamel

env <- r_environment(name = "my-env")
env$python$conda_dependencies$add_conda_package('pip==20.1.1')

The run information shows that this environment is used and that it is set up as specified. However, the pip version in the actual environment is not 20.1.1 but a later one.

I can check this from within my estimator with:

system('pip --version')
pip 20.3.3 from /opt/miniconda/lib/python3.7/site-packages/pip (python 3.7)

And of course I still encounter the ruamel issue.
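The manual check above can also be done programmatically, so that a run fails early when pip is not the pinned version. The following is a minimal sketch in base R; `parse_pip_version` and `pip_matches` are hypothetical helper names used for illustration and are not part of azuremlsdk.

```r
# Hypothetical helper (not part of azuremlsdk): extract the version number
# from the line printed by `pip --version`.
parse_pip_version <- function(line) {
  m <- regmatches(line, regexpr("[0-9]+(\\.[0-9]+)+", line))
  if (length(m) == 0) stop("no version number found in: ", line)
  m
}

# Hypothetical helper: TRUE when the installed pip equals the expected pin.
pip_matches <- function(line, expected) {
  package_version(parse_pip_version(line)) == package_version(expected)
}

# Example using the output reported above.
observed <- "pip 20.3.3 from /opt/miniconda/lib/python3.7/site-packages/pip (python 3.7)"
parse_pip_version(observed)       # "20.3.3"
pip_matches(observed, "20.1.1")   # FALSE

# Inside a run, the line could be captured instead of only printed:
# observed <- system("pip --version", intern = TRUE)
```

Using `package_version()` for the comparison avoids string-comparison pitfalls such as "20.10" sorting before "20.9".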

Thank you and best regards, Robert

To Reproduce

Steps to reproduce the behavior:

Specify the pip version when creating the environment:

env <- r_environment(name = "my-env")
env$python$conda_dependencies$add_conda_package('pip==20.1.1')

Then check the actual pip version from within the estimator: system('pip --version')

Expected behavior

The specified pip version should be installed and used.

robert4os commented 3 years ago

Workaround:

I have noticed that the environment created by r_environment sets the 'baseDockerfile' to:

"FROM mcr.microsoft.com/azureml/base:openmpi3.1.2-ubuntu16.04\nRUN conda install -c r -y r-essentials=3.6.0 r-reticulate rpy2 r-remotes r-e1071 r-optparse && conda clean -ay && pip install --no-cache-dir azureml-defaults\nENV TAR=\"/bin/tar\"\nRUN R -e \"remotes::install_cran('azuremlsdk', repos = 'https://cloud.r-project.org/', upgrade = FALSE)\"\n"

By overwriting the baseDockerfile with one whose conda install command also installs pip=20.1.1, I was able to work around the problem:

r_env <- r_environment(name = "basic_env")
r_env$docker$base_dockerfile <- "FROM mcr.microsoft.com/azureml/base:openmpi3.1.2-ubuntu16.04\nRUN conda install -c r -y pip=20.1.1 r-essentials=3.6.0 r-reticulate rpy2 r-remotes r-e1071 r-optparse && conda clean -ay && pip install --no-cache-dir azureml-defaults\nENV TAR=\"/bin/tar\"\nRUN R -e \"remotes::install_cran('azuremlsdk', repos = 'https://cloud.r-project.org/', upgrade = FALSE)\"\n"

This installs the desired pip version instead of the latest one.

For example, 20_image_build_log.txt reports: pip: 10.0.1-py37_0 --> 20.1.1-py37_1
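Because the workaround relies on one long escaped string, it may be easier to maintain if the Dockerfile is assembled from its parts. The sketch below is only an illustration in base R: `build_base_dockerfile` is a hypothetical helper, not an azuremlsdk function, and the Dockerfile content is copied from the workaround above with the pip pin turned into a parameter.

```r
# Hypothetical helper (not part of azuremlsdk): rebuild the base Dockerfile
# from the workaround above, with the pip version as a parameter.
build_base_dockerfile <- function(pip_version = "20.1.1") {
  conda_packages <- c(
    paste0("pip=", pip_version),
    "r-essentials=3.6.0", "r-reticulate", "rpy2",
    "r-remotes", "r-e1071", "r-optparse"
  )
  lines <- c(
    "FROM mcr.microsoft.com/azureml/base:openmpi3.1.2-ubuntu16.04",
    paste0(
      "RUN conda install -c r -y ", paste(conda_packages, collapse = " "),
      " && conda clean -ay && pip install --no-cache-dir azureml-defaults"
    ),
    "ENV TAR=\"/bin/tar\"",
    "RUN R -e \"remotes::install_cran('azuremlsdk', repos = 'https://cloud.r-project.org/', upgrade = FALSE)\""
  )
  # Join with newlines and keep the trailing newline of the original string.
  paste0(paste(lines, collapse = "\n"), "\n")
}

dockerfile <- build_base_dockerfile("20.1.1")
cat(dockerfile)

# Assigning it to the environment works as in the workaround above
# (requires azuremlsdk, so it is left commented out here):
# r_env <- r_environment(name = "basic_env")
# r_env$docker$base_dockerfile <- dockerfile
```

With the default argument, the generated string is intended to be identical to the one assigned in the workaround, so only the pin needs to change if a different pip version is required later.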

diondrapeck commented 3 years ago

@robert4os Thank you so much for finding and sharing this workaround. We will update the troubleshooting guide to include your solution as well.

cc: @mx-iao