This project demonstrates machine learning model training as a CI/CD system on Google Cloud Platform (GCP). The workflow is described in more detail below, but in short it rebuilds and redeploys (continuous integration) the currently deployed machine learning pipeline whenever the code changes. Such changes can occur in the training data, data preprocessing logic, model architecture and training code, custom pipeline components, and so on.
An accompanying blog post for this project is available on Google Cloud: Model training as a CI/CD system: Part I. Part II can be found here (code: sayakpaul/CI-CD-for-Model-Training).
We create the initial code, or we make changes to the existing codebase for the pipeline.
Based on these changes, a GitHub Action is triggered to initiate a Cloud Build process.
Cloud Build runs unit tests to check that the pipeline components work without errors.
If there are no errors, there are two common sub-workflows from this point.
The final step of the Cloud Build process is to execute a pipeline run on Vertex AI.
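The final step above can be sketched as follows. This is a minimal illustration, not the project's actual build script: it assumes the `google-cloud-aiplatform` package and its `PipelineJob` class are used to submit the run, and that the pipeline has already been compiled to a JSON spec. The helper names `build_pipeline_job_args` and `submit_pipeline_run`, and all argument values, are hypothetical.

```python
from typing import Any, Dict, Optional


def build_pipeline_job_args(
    display_name: str,
    template_path: str,
    pipeline_root: str,
    parameter_values: Optional[Dict[str, Any]] = None,
) -> Dict[str, Any]:
    """Collect keyword arguments for aiplatform.PipelineJob (hypothetical helper)."""
    # Vertex AI Pipelines stores run artifacts in Cloud Storage,
    # so reject anything that is not a gs:// URI early.
    if not pipeline_root.startswith("gs://"):
        raise ValueError("pipeline_root must be a Cloud Storage URI (gs://...)")
    return {
        "display_name": display_name,
        "template_path": template_path,  # compiled pipeline spec (JSON)
        "pipeline_root": pipeline_root,
        "parameter_values": parameter_values or {},
        "enable_caching": False,  # assumption: always re-run after a code change
    }


def submit_pipeline_run(project: str, location: str, job_args: Dict[str, Any]) -> None:
    """Submit a pipeline run to Vertex AI without waiting for it to finish."""
    # Imported lazily so the helper above can be tested without GCP packages.
    from google.cloud import aiplatform

    aiplatform.init(project=project, location=location)
    job = aiplatform.PipelineJob(**job_args)
    job.submit()
```

In a Cloud Build step, a script would call `build_pipeline_job_args(...)` with the compiled spec path and then pass the result to `submit_pipeline_run(...)`. Keeping the argument construction separate from the submission makes it possible to unit test the former without GCP credentials.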
We create the initial code, or we make changes to the existing codebase for the modules.
Based on these changes, a GitHub Action is triggered to initiate a Cloud Build process.
Cloud Build runs unit tests to check that the modules work without errors.
If there are no errors, there are two common sub-workflows from this point.
The final step of the Cloud Build process is to execute a pipeline run on Vertex AI. The Trainer and Transform TFX components look up the changed modules accordingly.
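The two workflows above differ in what changed: the pipeline itself or only the modules. One way to decide which workflow applies is to inspect the paths of the changed files. The sketch below is an assumption for illustration only; the directory names `pipeline`, `components`, and `modules` are hypothetical and may not match this repository's layout, and the project may instead rely on path filters in its GitHub Actions configuration.

```python
from pathlib import PurePosixPath
from typing import Iterable

# Hypothetical top-level directories; adjust to the real repository layout.
PIPELINE_DIRS = {"pipeline", "components"}
MODULE_DIRS = {"modules"}


def select_workflow(changed_files: Iterable[str]) -> str:
    """Return 'pipeline', 'modules', or 'none' for a set of changed paths."""
    # Look only at the first path segment of each changed file.
    top_levels = {PurePosixPath(p).parts[0] for p in changed_files if p}
    # A pipeline change takes priority because it triggers a full rebuild,
    # which also picks up any changed modules.
    if top_levels & PIPELINE_DIRS:
        return "pipeline"
    if top_levels & MODULE_DIRS:
        return "modules"
    return "none"
```

For example, `select_workflow(["modules/train.py"])` selects the modules workflow, while a change set that touches both `pipeline/` and `modules/` selects the pipeline workflow.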
Thanks to the ML-GDE program for providing GCP credits, and to Karl Weinmeister for providing review feedback on this project.