Open daavoo opened 3 years ago
See previous discussion comments:
https://github.com/iterative/dvc/discussions/6542#discussioncomment-1287600
Copying another comment here:
I am more curious about the recommended way for this demo https://github.com/iterative/cml_cloud_case/blob/master/.github/workflows/cml.yaml, as it uses an AWS EC2 instance. However, the current GitHub Actions workflow does not do something like
dvc push
cml-pr .
In the experiment pull request, it also does not update `dvc.lock` and similar files, which is why I came up with this question. (If we always use an AWS EC2 instance with this approach, `dvc.lock` will never get updated.)
Would be great to have some recommendations! 😃
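As a rough sketch of what the question is asking for, the training job in the workflow quoted in this thread could gain one more step after `dvc repro`. The step name below is made up for illustration, and the directory and secrets are assumed to match that workflow; `dvc push` and `cml-pr .` are the two commands named in the comment above. This is not a confirmed recommendation, only one way the pieces might fit together.

```yaml
# Hypothetical extra step for the training job, placed after "Train model".
- name: Push outputs and open PR   # illustrative name, not from the original workflow
  working-directory: convolutional-neural-network
  shell: bash -l {0}
  env:
    REPO_TOKEN: ${{ secrets.CML_ACCESS_TOKEN }}
    AWS_ACCESS_KEY_ID: ${{ secrets.AWS_ACCESS_KEY_ID }}
    AWS_SECRET_ACCESS_KEY: ${{ secrets.AWS_SECRET_ACCESS_KEY }}
  run: |
    # Upload the outputs produced by `dvc repro` to the DVC remote.
    dvc push
    # Open a pull request with the files changed by the run (dvc.lock etc.).
    cml-pr .
```

With a step like this, the updated `dvc.lock` would reach the repository through a pull request created by CI rather than through a commit made by a developer.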
To be more specific, it would be great to cover these scenarios and still not mess up the DVC:
Another option that I was taught was to use DVC's run-cache.
In the CML runner:
dvc repro
dvc push --run-cache
On the local machine:
dvc pull --run-cache
dvc repro --pull
# any further checks or analysis of results
git add .
git commit -m "commit experiment"
This is less automated than the new `cml-pr`, but has the benefit that the developers are making the commits, if that is important.
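On the runner side, the two commands above might sit in a single workflow step, as in the sketch below. The step name is hypothetical, and the working directory and AWS secrets are assumptions borrowed from the workflow quoted in this thread; any other variables the training needs (for example an experiment-tracking key) would have to be added to `env`.

```yaml
# Hypothetical step for the CML runner job; replaces a plain `dvc repro` step.
- name: Train and push run-cache   # illustrative name
  working-directory: convolutional-neural-network
  shell: bash -l {0}
  env:
    AWS_ACCESS_KEY_ID: ${{ secrets.AWS_ACCESS_KEY_ID }}
    AWS_SECRET_ACCESS_KEY: ${{ secrets.AWS_SECRET_ACCESS_KEY }}
  run: |
    # Run the pipeline on the cloud instance.
    dvc repro
    # Upload outputs together with the run-cache so the run can be restored locally.
    dvc push --run-cache
```

Nothing is committed from CI in this variant; the developer restores the results locally and makes the commit.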
I've been wondering about all these questions myself and haven't found a satisfying answer yet.
@mattlbeck Wouldn't `dvc push --run-cache` pollute the DVC remote storage? The amount of run-cache data will grow quickly over time. Perhaps some kind of garbage collection is needed.
Relevant links for any future readers:
Why? I'm assuming the use case is rerunning or inspecting `dvc exp`/`dvc repro` runs conducted from CI/CD (the user can perform commit / or / or discard changes instead of CI).
Discussed in https://github.com/iterative/dvc/discussions/6542
```yaml
cml-cloud-set-up-cloud:
  name: CML (Cloud) - Set up cloud
  runs-on: ubuntu-20.04
  steps:
    - name: Cancel previous runs
      uses: styfle/cancel-workflow-action@0.9.1
      with:
        access_token: ${{ github.token }}
    - name: Checkout
      uses: actions/checkout@v2
    - name: Set up CML
      uses: iterative/setup-cml@v1
    - name: Set up cloud
      shell: bash
      env:
        REPO_TOKEN: ${{ secrets.CML_ACCESS_TOKEN }}
        AWS_ACCESS_KEY_ID: ${{ secrets.AWS_ACCESS_KEY_ID }}
        AWS_SECRET_ACCESS_KEY: ${{ secrets.AWS_SECRET_ACCESS_KEY }}
      run: |
        cml-runner \
          --cloud=aws \
          --cloud-region=us-west-2 \
          --cloud-type=t2.small \
          --labels=cml-runner
cml-cloud-train:
  name: CML (Cloud) - Train
  needs: cml-cloud-set-up-cloud
  runs-on: [self-hosted, cml-runner]
  # container: docker://iterativeai/cml:0-dvc2-base1-gpu
  container: docker://iterativeai/cml:0-dvc2-base1
  steps:
    - name: Cancel previous runs
      uses: styfle/cancel-workflow-action@0.9.1
      with:
        access_token: ${{ github.token }}
    - name: Checkout
      uses: actions/checkout@v2
    - name: Set up Miniconda
      uses: conda-incubator/setup-miniconda@v2
      with:
        miniconda-version: "latest"
        activate-environment: hm-cnn
    - name: Install requirements
      working-directory: convolutional-neural-network
      shell: bash -l {0}
      run: |
        conda install pytorch torchvision torchaudio --channel=pytorch
        conda install pandas
        conda install tabulate
        pip install -r requirements.txt
    - name: Pull Data
      working-directory: convolutional-neural-network
      env:
        AWS_ACCESS_KEY_ID: ${{ secrets.AWS_ACCESS_KEY_ID }}
        AWS_SECRET_ACCESS_KEY: ${{ secrets.AWS_SECRET_ACCESS_KEY }}
      run: |
        dvc pull
    - name: Train model
      working-directory: convolutional-neural-network
      shell: bash -l {0}
      env:
        WANDB_API_KEY: ${{ secrets.WANDB_API_KEY }}
      run: |
        dvc repro
    - name: Write CML report
      working-directory: convolutional-neural-network
      shell: bash -l {0}
      env:
        REPO_TOKEN: ${{ secrets.CML_ACCESS_TOKEN }}
      run: |
        echo "# CML (Cloud) Report" >> report.md
        echo "## Params" >> report.md
        cat output/reports/params.txt >> report.md
        cml-send-comment report.md
```