batpad closed this issue 2 years ago
Thanks for opening this issue, @batpad. I agree this is a priority. I've made some notes about FastAPI deployment requirements in https://github.com/pangeo-forge/pangeo-forge-orchestrator/pull/80#issuecomment-1236002982 (buried a bit far down, under the Documentation bullet point). As noted there, next week I will translate these rough notes into a formal deploy.md document. From there, we can see what is needed to move that out of the orchestrator repo. 🥳
For those following along, @yuvipanda has gone ahead and started working on this here: https://github.com/yuvipanda/pangeo-forge-federation .
Currently, it's a WIP with terraform templates that set up an EKS cluster on AWS and install Apache Flink on it via terraform-helm, so that it can receive jobs. Ultimately, it will hold terraform code for all major cloud providers and make it easy to configure the credentials needed to add more bakeries to the federation.
Speaking to @yuvipanda, it should also have terraform (or something) to provision the orchestrator.
Let's move this discussion to the new repository. Since that repository now exists and a lot of work has already been done, I am going to close this issue and open issues in the https://github.com/yuvipanda/pangeo-forge-federation repository.
@batpad I've just moved the issue to this repo, so it's all in one place :)
Following a bit on from yuvipanda/pangeo-forge-runner#19 - we will run Apache Flink on Kubernetes, and @yuvipanda has made a ton of progress getting this to work as a bakery backend in this PR: https://github.com/yuvipanda/pangeo-forge-runner/pull/21
After speaking with @yuvipanda: we should create a repository to manage setting up the k8s infrastructure on different cloud provider backends, starting with AWS. Ideally, the structure of this repository would make it easy to add bakeries to the "federation".
This would contain:
sops.
We can look at these repositories for an idea of how we could structure things (thanks @yuvipanda):
cc @yuvipanda @cisaacstern