We want to be as cost-efficient as possible while keeping the data processing steps reasonably usable.
To do that, we need to understand exactly how the pricing works: the flat fees and dynamic costs of the cloud resources we will use, including GKE cluster configurations and cloud storage (e.g. storage buckets vs. NFS volumes), and especially where the biggest potential cost reductions lie.
Some cost savings will likely come with drawbacks, in latency/time or other aspects, so in those cases we aim to find the point that offers the best value per cost.
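To make the trade-off concrete, here is a minimal sketch of how we might compare setups once real prices are known. It assumes a monthly cost made up of a flat fee plus a per-job dynamic cost, and treats "best value" as the cheapest option that still meets a latency limit. The option names, the figures, and the `Option`, `monthly_cost` and `cheapest_within_limit` helpers are all hypothetical placeholders, not actual GKE or storage prices.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class Option:
    """One candidate setup. All figures are placeholders, not real prices."""
    name: str
    flat_fee_per_month: float   # fixed cost, paid regardless of usage
    cost_per_job: float         # dynamic cost that scales with usage
    minutes_per_job: float      # the drawback we weigh against cost


def monthly_cost(option: Option, jobs_per_month: int) -> float:
    """Flat fee plus the dynamic cost for the expected number of jobs."""
    return option.flat_fee_per_month + option.cost_per_job * jobs_per_month


def cheapest_within_limit(
    options: Sequence[Option], jobs_per_month: int, max_minutes_per_job: float
) -> Optional[Option]:
    """Cheapest option whose job duration stays within the limit, or None."""
    acceptable = [o for o in options if o.minutes_per_job <= max_minutes_per_job]
    if not acceptable:
        return None
    return min(acceptable, key=lambda o: monthly_cost(o, jobs_per_month))


# Hypothetical options with made-up numbers, for illustration only.
options = [
    Option("always-on", flat_fee_per_month=100.0, cost_per_job=0.10, minutes_per_job=5.0),
    Option("scale-to-zero", flat_fee_per_month=10.0, cost_per_job=0.25, minutes_per_job=12.0),
    Option("cheap-but-slow", flat_fee_per_month=0.0, cost_per_job=0.05, minutes_per_job=60.0),
]

for jobs in (200, 1000):
    best = cheapest_within_limit(options, jobs, max_minutes_per_job=15.0)
    print(jobs, best.name if best else "no acceptable option")
```

With these placeholder numbers the preferred option changes with the workload, which is why we need both the real pricing and a realistic estimate of our usage before deciding.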
Afterwards, we should plan what to prioritise in our testing, starting with the possible cost savings that have the biggest impact and then proceeding to the smaller cost reduction opportunities.
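One simple way to order the testing work, sketched under the assumption that we can estimate a current and an expected monthly cost per area. The area names and figures below are hypothetical placeholders, and `rank_by_estimated_saving` is an illustrative helper rather than an existing tool.

```python
def rank_by_estimated_saving(opportunities):
    """Return (name, saving) pairs sorted by estimated monthly saving, largest first.

    `opportunities` maps a name to (current_cost, expected_cost_after_change).
    """
    savings = {
        name: current - expected
        for name, (current, expected) in opportunities.items()
    }
    return sorted(savings.items(), key=lambda item: item[1], reverse=True)


# Made-up estimates, for illustration only.
opportunities = {
    "storage": (120.0, 100.0),
    "compute": (500.0, 350.0),
    "network": (40.0, 38.0),
}

ranking = rank_by_estimated_saving(opportunities)
for name, saving in ranking:
    print(f"{name}: estimated saving {saving:.2f} per month")
```

Ranking by absolute saving rather than by percentage keeps the focus on the items that move the total bill the most.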
Some example resources to start from:
https://cloud.google.com/kubernetes-engine/pricing
https://cloud.google.com/products/calculator?hl=en
https://cloud.google.com/kubernetes-engine/docs/how-to/cost-allocations
https://www.cloudzero.com/blog/gke-pricing/