nasaharvest / crop-mask

End-to-end workflow for generating high resolution cropland maps
Apache License 2.0

Training a model inside a Github Action #262

Closed ivanzvonkov closed 1 year ago

ivanzvonkov commented 1 year ago

Context: Creating a crop map is a four-step pipeline: 1) add evaluation data, 2) train a model, 3) deploy the model, 4) create the map. As the project has matured, this pipeline has become steadily more automated. For example, 1) and 3) run almost entirely inside GitHub Actions.

Problem: Training a model is currently not as automated as it could be. It requires either running the training script locally or using the Colab notebook, which a given developer may not have kept up to date.

Potential solution: Step 2), training a model, can be moved entirely into a GitHub Action. It can be implemented similarly to the deploy action: https://github.com/nasaharvest/crop-mask/blob/master/.github/workflows/deploy.yml The Train GitHub Action should accept training parameters and use them to run the training script on GitHub's machines. Like the Test GitHub Action (https://github.com/nasaharvest/crop-mask/blob/master/.github/workflows/test.yml), it should push the new model to the branch. The training script should have wandb enabled, which will require a wandb "bot" account for logging.

bhyeh commented 1 year ago

Current draft of gh-action train workflow:

https://github.com/nasaharvest/crop-mask/blob/github_action_training/.github/workflows/train.yml

Notes:

  1. At the moment there are no user-definable training sets; all available training datasets are used by default. The next step is to look into an optional input for users who want to specify which training sets to use.
  2. At the moment the workflow checks out the model with a remote branch and creates a pull request. @ivanzvonkov mentioned in Monday's meeting running the workflow directly from an issue instead (?)
  3. Potentially still awaiting a Weights and Biases API key secret for 'bot' authentication inside the action, for logging.
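
Until the 'bot' secret from note 3 exists, the training script could decide at runtime whether to log. The sketch below is only an illustration, not the repository's code: `wandb_mode` is a hypothetical helper, and it assumes the workflow would expose the secret to the script as the `WANDB_API_KEY` environment variable, which is the variable wandb reads for authentication.

```python
import os


def wandb_mode(env=None):
    """Pick a wandb mode depending on whether an API key is available.

    `env` defaults to the process environment; passing a dict makes the
    helper easy to try out without touching real environment variables.
    """
    env = os.environ if env is None else env
    # A non-empty key means the bot account can log; otherwise turn logging off.
    return "online" if env.get("WANDB_API_KEY") else "disabled"


# Hypothetical usage inside a training script (requires the wandb package):
#   import wandb
#   wandb.init(project="crop-mask", mode=wandb_mode())
print(wandb_mode({}))
print(wandb_mode({"WANDB_API_KEY": "placeholder"}))
```

With this approach the action would still train a model when the secret is missing, just without logging. The project name in the usage comment is a placeholder.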

A new model can now be trained through the GitHub CLI: gh workflow run train.yml -f MODEL_NAME=Ethiopia_Tigray_2020 -f EVAL_DATASETS=Ethiopia_Tigray_2020 -f BBOX=Ethiopia_Tigray -f UP_TO_YEAR=2021
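
For anyone triggering training from a script rather than typing the command by hand, the invocation above could be assembled as follows. This is a minimal sketch: `build_train_command` is a hypothetical helper, not part of the repository, and the input names are the ones from the command above. The optional `ref` argument maps to the `--ref` flag of `gh workflow run`, which selects the branch to run the workflow from.

```python
import shlex


def build_train_command(inputs, ref=None, workflow="train.yml"):
    """Assemble a `gh workflow run` invocation as an argument list."""
    cmd = ["gh", "workflow", "run", workflow]
    if ref is not None:
        # --ref selects the branch (or tag) whose workflow file is run.
        cmd += ["--ref", ref]
    for key, value in inputs.items():
        # Each workflow input is passed as -f KEY=VALUE.
        cmd += ["-f", f"{key}={value}"]
    return cmd


inputs = {
    "MODEL_NAME": "Ethiopia_Tigray_2020",
    "EVAL_DATASETS": "Ethiopia_Tigray_2020",
    "BBOX": "Ethiopia_Tigray",
    "UP_TO_YEAR": "2021",
}

# Print the command instead of running it, so it can be inspected first.
print(shlex.join(build_train_command(inputs)))
```

To actually dispatch the workflow, the list can be passed to `subprocess.run(cmd, check=True)` on a machine where `gh` is installed and authenticated.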

bhyeh commented 1 year ago

Notes:

  1. The user specifies the branch for the action to run on. The master branch is the default; if an existing branch is selected instead, no branch or PR is created.

bhyeh commented 1 year ago

Pull request @ #268

bhyeh commented 1 year ago

Added:

  1. Added handling for branch specification

Train a model through the GitHub CLI, using the ref flag to specify the branch to run the workflow from:

  1. No branch specified: the workflow runs from the master branch by default and creates a PR with the model.
  2. Branch branch-name specified: the workflow runs from that branch and no PR or branch is created. This assumes branch-name exists.

bhyeh commented 1 year ago

Problem: Running the workflow results in this error:

Error: google-github-actions/auth failed with: retry function failed after 1 attempt: failed to parse service account key JSON credentials: unexpected token  in JSON at position 0

Potential solution: In train.yml, under the step 'Authenticate to Google Cloud', change line 77 from $GCP_SA_KEY to ${{ secrets.GCP_SA_KEY }}.
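
A likely explanation, assuming line 77 passes the key as an action input: action inputs are not shell-expanded, so the action would receive the literal text $GCP_SA_KEY rather than the secret's JSON. The Python sketch below reproduces that kind of failure with the standard `json` module. It only illustrates the failure mode, it is not the action's own parsing code, and the "expanded" value is a made-up placeholder rather than a real key.

```python
import json


def parse_credentials(raw):
    """Return the parsed key, or None if the string is not valid JSON."""
    try:
        return json.loads(raw)
    except json.JSONDecodeError as err:
        # err.pos is the character offset where parsing failed.
        print(f"failed to parse credentials at position {err.pos}: {err.msg}")
        return None


# What a parser sees if "$GCP_SA_KEY" is passed through unexpanded.
literal = "$GCP_SA_KEY"

# Stand-in for the expanded secret; the fields are placeholders, not a real key.
expanded = '{"type": "service_account", "project_id": "example-project"}'

print(parse_credentials(literal))
print(parse_credentials(expanded)["type"])
```

The first call fails at position 0, matching the position reported in the error above, while the second parses cleanly.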