iterative / cml

♾️ CML - Continuous Machine Learning | CI/CD for ML
http://cml.dev
Apache License 2.0

Updating python3 version on dvcorg/cml-py3:runner #360

Closed ricoms closed 3 years ago

ricoms commented 3 years ago

I have some projects that require Python 3.8, and I tried the two options below:

  1. Install python3.8 inside your dvcorg/cml-py3:runner image. However, I am unable to mount volumes on the image, and I don't know how to investigate why. I'm basically running docker run --rm -v scripts:/opt dvcorg/cml-py3:runner ./opt/test.sh, but the response is ..."exec: \"./opt/test.sh\": stat ./opt/test.sh: no such file or directory": unknown.. The content of opt/test.sh is apt update && sudo apt install python3.8. Note: I was not able to find out which operating system the image uses. Can I help to document and maintain your docker images? Where is the repository generating them?

  2. Run a container of my own that performs the training, like this in GitHub Actions:

    train-test:
      name: Train and report
      needs: [python-test]
      runs-on: [ubuntu-latest]
      container: docker://dvcorg/cml-py3:runner

      steps:
        - uses: actions/checkout@v2

        - name: Configure AWS Credentials
          uses: aws-actions/configure-aws-credentials@v1
          with:
            aws-access-key-id: ${{ secrets.AWS_ACCESS_KEY_ID }}
            aws-secret-access-key: ${{ secrets.AWS_SECRET_ACCESS_KEY }}
            aws-region: us-east-1

        - name: Create virtualenv
          shell: bash
          run: |
            pip install --no-cache-dir pipenv==2020.8.13
            pipenv install --dev

        - name: cml_run
          shell: bash
          env:
            repo_token: ${{ secrets.GITHUB_TOKEN }}
          run: |
            # run-cache and reproduce pipeline
            dvc pull ml/input/data/training/creditcard.csv.dvc
            dvc repro

Note: I had to upgrade DVC, because the version shipped in the original Docker image was below 1.0 and required a Dvcfile.

Continuing with my explanation: dvc repro in the code above calls a docker command from a Makefile, like this: docker run --rm -u ${CURRENT_UID}:${CURRENT_UID} -v ${PWD}/ml:/opt/m ${DOCKER_IMAGE_NAME} train --project_name ${project-name} --input_dir /opt/${DATA_FILE}. The problem here is the volume mounting. The logs show that the volume was created, but the code in the container can't find the files that are supposed to be in that volume.

If it helps I have an open example here: https://github.com/ricoms/credit-fraud-dealing-with-imbalanced-datasets-mlops
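
A note on the first command: with Docker's -v syntax, a source without a leading slash (here scripts) is treated as a named volume rather than a host directory, and ./opt/test.sh is resolved relative to the container's working directory rather than /opt. A minimal sketch of the same call with an absolute host path, written as a workflow step, might look like the following. The step name, the location of test.sh under scripts/, and running the script through sh are assumptions, and the step is assumed to run directly on the runner rather than inside a container: job.

```yaml
# Hypothetical step: the name and the scripts/ layout are assumptions.
- name: Run test.sh through a bind mount
  shell: bash
  run: |
    # An absolute host path makes Docker bind-mount the directory;
    # a bare name such as "scripts" would create a named volume instead.
    # The script is addressed by its absolute path inside the container.
    docker run --rm -v "${PWD}/scripts:/opt" dvcorg/cml-py3:runner sh /opt/test.sh
```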

DavidGOrtega commented 3 years ago

👋 @ricoms

please replace

container: docker://dvcorg/cml-py3:runner

with

container: docker://dvcorg/cml-py3:latest

Actually, I'm not sure what that runner tag is 🤔 It might be a stale version. Did you find it in a tutorial?

Can I help to document and maintain your docker images?

Absolutely!

Where is the repository generating them?

This one! The Dockerfiles are in the docker folder, and the GitHub Actions workflows inside .github/workflows build the images.

diegobit commented 3 years ago

Hi, in our company we have some workflows with Python 3.8 and CUDA 11. It would be nice to have a CML image with Python 3.8 and CUDA 11 already available. Thanks

DavidGOrtega commented 3 years ago

@diegobit have you tried using the setup-python action instead? It does not solve the CUDA part, but it would be a start.
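
For instance, a minimal sketch of that approach could look like this. It assumes the job runs directly on the runner rather than inside the CML container, and the last step is only an illustrative check of the interpreter version.

```yaml
steps:
  - uses: actions/checkout@v2

  # Select the Python version for the following steps
  - uses: actions/setup-python@v2
    with:
      python-version: '3.8'

  # Illustrative check only; replace with the real training steps
  - run: python --version
```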

DavidGOrtega commented 3 years ago

related to #217

0x2b3bfa0 commented 3 years ago

@ricoms & @diegobit, we've released new image tags with updated package versions: docker://dvcorg/cml:0-dvc2-base1-gpu includes Ubuntu 20.04, Python 3.8, CUDA 11.0.3 and cuDNN 8. 🎉
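
As a rough sketch, the new tag could be referenced from a workflow like this. The job name is borrowed from the example earlier in this thread, the version check step is an assumption, and actual GPU access would depend on the runner, which is not shown here.

```yaml
train-test:
  runs-on: [ubuntu-latest]
  # Image tag from the announcement above
  container: docker://dvcorg/cml:0-dvc2-base1-gpu

  steps:
    - uses: actions/checkout@v2

    # Assumed check that the interpreter in the image is Python 3.8
    - run: python3 --version
```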

diegobit commented 3 years ago

Thanks! 🙏