This PR, once installed/deployed, will cause all deployments to use a Kubernetes Work Pool running on OpenShift instead of a Docker Agent running on our Linux VM. It will also switch flow storage from MinIO to GitHub. Result storage still uses MinIO.
Impacts:
When you deploy or run a flow from the command line using this release, your local Git working tree must be clean, meaning all changes are committed and pushed. The deployment will then use the code from GitHub to run the flow.
This means you only need to deploy a flow once; subsequent pushes to GitHub will automatically update the deployment without the need to redeploy.
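For reference, the "clean, meaning committed and pushed" requirement can be checked by hand roughly as follows. This is only a sketch of the idea, not the check this release actually performs: the helper names (`run_git`, `is_clean_status`, `working_tree_is_clean_and_pushed`) are hypothetical, and it assumes the current branch has an upstream configured. It compares against the local remote-tracking ref and does not fetch first.

```python
import subprocess


def run_git(*args: str, cwd: str = ".") -> str:
    """Run a git command and return its stdout (raises if git fails)."""
    result = subprocess.run(
        ["git", *args], cwd=cwd, capture_output=True, text=True, check=True
    )
    return result.stdout.strip()


def is_clean_status(porcelain_output: str) -> bool:
    """`git status --porcelain` prints nothing when there are no local changes."""
    return porcelain_output.strip() == ""


def working_tree_is_clean_and_pushed(cwd: str = ".") -> bool:
    """Hypothetical helper: True if nothing is uncommitted and nothing is unpushed."""
    if not is_clean_status(run_git("status", "--porcelain", cwd=cwd)):
        return False
    try:
        # Count commits on HEAD that the upstream branch does not have yet.
        unpushed = run_git("rev-list", "--count", "@{upstream}..HEAD", cwd=cwd)
    except subprocess.CalledProcessError:
        # No upstream configured, so the branch has never been pushed.
        return False
    return unpushed == "0"
```

Calling `working_tree_is_clean_and_pushed()` from the repo root before deploying gives a quick yes/no answer.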
If you use the run command, it will still delete the deployment when finished, for the sake of convenience. Redeploying and deleting the deployment each time you run a flow on a dev branch is not a big deal, especially since deploying from GitHub is faster than using MinIO.
We will have to be more diligent about deleting dev deployments when they are no longer needed (if you aren't using "run"), and also about deleting GitHub branches that were only used for testing.
The way images are used is not changing. You can still use the --image-branch option to deploy with a different image than the default:main one.
You can still use the --label option to deploy from the main branch as if it were a dev branch, or from a dev branch as if it were a main branch. This is just to either A) test from main without sending an error message to Teams or B) run a dev branch on the main infrastructure to check things like firewall connections.
All flows will have a memory limit of 16 GB. If a flow exceeds it, Prefect will show the flow run as Crashed. When you click into the Details tab on the flow run in Prefect Cloud, it will say "Flow run process exited with non-zero status code -9", indicating the process was killed by the operating system (due to hitting the memory limit).
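As background on the -9: Python's subprocess module reports a negative return code when a child process is terminated by a signal on POSIX, and signal 9 is SIGKILL, which is what the kernel sends when a container goes over its memory limit. The sketch below decodes a status code that way. The function name `describe_exit_code` is made up for illustration, and it assumes the code Prefect shows follows the same convention. Note that SIGKILL can have other causes, so -9 points at the memory limit without proving it.

```python
import signal


def describe_exit_code(code: int) -> str:
    """Hypothetical helper: turn a process status code into a readable summary."""
    if code == 0:
        return "exited normally"
    if code < 0:
        # Negative codes mean "terminated by signal number abs(code)".
        try:
            name = signal.Signals(-code).name
        except ValueError:
            # Signal number not known on this platform.
            name = f"signal {-code}"
        return f"killed by {name}"
    return f"exited with error status {code}"
```

On Linux, `describe_exit_code(-9)` reports that the process was killed by SIGKILL.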
Deployment steps (see oit-ds-apps-prefect for more info):
[x] Ensure all worker helm charts are deployed
[x] Ensure the prod namespace has the needed secrets (dev is already done)
[x] Ensure all work pools are created
[x] Ensure PE has set the needed resource quotas and limit ranges for the namespaces
[x] Merge this PR and create the appropriate release
[x] Commit the appropriate changes in the images repo to rebuild main:default with this release, along with the prefect-aws package
[ ] Tell everyone to reinstall the requirements for the main:default image so they will start deploying with the new release
[ ] Make a plan to test and deploy each existing flow onto the new infrastructure (create a tracking spreadsheet to share progress)