Missing modules/packages in Run unit tests #267

Closed: martin-weber closed this issue 4 years ago

martin-weber commented 4 years ago

It looks as if the Run unit tests step in the Model CI Pipeline does not yet have the packages from ci_dependencies.yml or conda_dependencies.yml applied. If I understand correctly, the conda packages are only installed later, by the Publish Azure Machine Learning Pipeline step, when the environment is created.

What is the best way to have the same modules/packages installed for the unit tests and for the later Environment?

eedorenko commented 4 years ago

The pipeline steps run in a Docker container, which has all packages from ci_dependencies.yml installed.

martin-weber commented 4 years ago

Thanks @eedorenko. If I understand that correctly, I need to create a custom container if I want to add packages to ci_dependencies.yml and have them available for my unit tests. Is that right?

So far I have not used a custom container, because the comment in conda_dependencies.yml says:

# Conda environment specification. The dependencies defined in this file will
# be automatically provisioned for managed runs. These include runs against
# the localdocker, remotedocker, and cluster compute targets.

Does this mean that the dependencies defined here should be available to the CI unit tests? I added lightgbm to my conda_dependencies.yml and to my ci_dependencies.yml (both shown below), but I do not find it among the conda and pip packages (see logs below).


name: mlopsohmw_training_env
dependencies:
  # The python interpreter version.
  # Currently Azure ML Workbench only supports 3.5.2 and later.
  - python=3.7.*
  - pip

  - pip:
    # Base AzureML SDK
    - azureml-sdk==1.3.*

    # Minimum required for the scoring environment. Must match AzureML SDK version.
    - azureml-defaults==1.3.*

    # Training deps
    - numpy==1.18.*
    - pandas==1.0.*
    - scikit-learn==0.22.*
    - lightgbm==2.3.*

    # Scoring deps
    - inference-schema[numpy-support]

    # MLOps with R
    - azure-storage-blob

    # current project
    - azureml-dataprep
    - azureml-monitoring


name: mlopspython_ci
dependencies:

  # The python interpreter version.
  - python=3.7.*

  # dependencies with versions aligned with conda_dependencies.yml.
  - numpy=1.18.*
  - pandas=1.0.*
  - scikit-learn=0.22.*
  - lightgbm=2.3.*
  # dependencies for MLOps with R.
  - r=3.6.0
  - r-essentials=3.6.0

  - pip=20.0.*

  - pip:
    # dependencies with versions aligned with conda_dependencies.yml.
    - azureml-sdk==1.3.*
    # Additional pip dependencies for the CI environment.
    - pytest==5.4.*
    - pytest-cov==2.8.*
    - requests==2.23.*
    - python-dotenv==0.12.*
    - flake8==3.7.*
    - flake8_formatter_junit_xml==0.0.*
    - azure-cli==2.3.*
    - lightgbm==2.3.*
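Since the two files above are meant to stay aligned, a small check can flag pins that drift apart. The following is only a minimal sketch: it reads `- package=version` style lines with a regular expression rather than a real YAML parser, treats conda's `=` and pip's `==` as equivalent, and the function names `parse_pins` and `pin_mismatches` are hypothetical helpers, not part of the repository.

```python
import re

# Matches a list entry such as "- numpy=1.18.*" or "- lightgbm==2.3.*".
# The version part is optional ("- pip" has none).
_ENTRY = re.compile(
    r"^\s*-\s*([A-Za-z0-9_.\-\[\]]+?)\s*(?:==?\s*([^\s#]+))?\s*(?:#.*)?$"
)


def parse_pins(text):
    """Return {package_name: version_or_None} for each "- pkg[=version]" line."""
    pins = {}
    for line in text.splitlines():
        stripped = line.strip()
        # Skip comments and mapping keys such as "dependencies:" or "- pip:".
        if stripped.startswith("#") or stripped.endswith(":"):
            continue
        match = _ENTRY.match(line)
        if match:
            name, version = match.groups()
            pins[name.lower()] = version
    return pins


def pin_mismatches(training_text, ci_text):
    """Return packages listed in both files whose pinned versions differ."""
    training = parse_pins(training_text)
    ci = parse_pins(ci_text)
    return {
        name: (training[name], ci[name])
        for name in sorted(training.keys() & ci.keys())
        if training[name] != ci[name]
    }


if __name__ == "__main__":
    # Illustrative inputs only; in practice the text would be read from
    # conda_dependencies.yml and ci_dependencies.yml.
    training_example = "dependencies:\n  - pip:\n    - lightgbm==2.3.*\n"
    ci_example = "dependencies:\n  - lightgbm=2.4.*\n"
    print(pin_mismatches(training_example, ci_example))
```

Such a check only confirms that the two files agree. It does not install anything into the container that runs the unit tests.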

I added a step to show the environment at the beginning of .pipelines\code-quality-template.yml

- bash: |   
    conda info -e
    conda list
    pip list
  displayName: 'show environment'

These are the outputs:

It is also confusing why the base conda environment is active.

conda info -e

# conda environments:
base                  *  /usr/local
mlopspython_ci           /usr/local/envs/mlopspython_ci
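To see which environment the test run itself uses, the same question can be asked from inside Python. This is a minimal sketch, assuming it is run with the same interpreter that runs the unit tests; `environment_report` is a hypothetical helper, and `CONDA_DEFAULT_ENV` is the variable conda sets when an environment is activated, so it may be unset if no activation happened.

```python
import importlib.util
import os
import sys


def environment_report(module_names):
    """Describe the running interpreter and which top-level modules it can import."""
    return {
        # Directory of the environment the interpreter belongs to.
        "prefix": sys.prefix,
        # Name of the activated conda environment, or None if not set.
        "conda_env": os.environ.get("CONDA_DEFAULT_ENV"),
        # True if the module can be located without importing it.
        "modules": {
            name: importlib.util.find_spec(name) is not None
            for name in module_names
        },
    }


if __name__ == "__main__":
    report = environment_report(["lightgbm", "pytest"])
    print("prefix:   ", report["prefix"])
    print("conda env:", report["conda_env"])
    for name, available in report["modules"].items():
        print(f"{name}: {'available' if available else 'missing'}")
```

If the prefix points at /usr/local rather than /usr/local/envs/mlopspython_ci, the interpreter in use is the one from the base environment.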

eedorenko commented 4 years ago

Yes, @martin-weber, you understand that right.

martin-weber commented 4 years ago

Thanks @eedorenko. OK, that solves my issue.