Closed by chuckbelisle 1 year ago
@jacek-dudek, please update this ticket with any research or elaboration that was made.
Carried out some learning activities on Terraform, Docker, Kubernetes, and the OpenM++ web service.
Created a Dockerfile for a basic containerized deployment of OpenM++ that runs its web service on startup.
Uploaded Dockerfile to Docker Hub registry.
Created a basic Kubernetes cluster deployment on Azure using Terraform.
Created manifest files for the OpenM++ container and a load balancer to publish the application.
Confirmed that the basic setup runs successfully.
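For readers who want a concrete picture of the manifests mentioned above, here is a minimal sketch, not the actual files from this work. It builds a Deployment for the OpenM++ container and a LoadBalancer Service as Python dicts and prints them as JSON, which `kubectl apply -f -` accepts. The image name `example/openmpp:latest`, the object names, and container port 4040 are all assumptions made for illustration.

```python
import json


def openmpp_manifests(image, port=4040, replicas=1):
    """Build a Deployment and a LoadBalancer Service for one container.

    All names, the image, and the port are illustrative assumptions,
    not values taken from the original manifests.
    """
    labels = {"app": "openmpp"}  # hypothetical label tying Service to pods
    deployment = {
        "apiVersion": "apps/v1",
        "kind": "Deployment",
        "metadata": {"name": "openmpp", "labels": labels},
        "spec": {
            "replicas": replicas,
            "selector": {"matchLabels": labels},
            "template": {
                "metadata": {"labels": labels},
                "spec": {
                    "containers": [
                        {
                            "name": "oms",
                            "image": image,
                            "ports": [{"containerPort": port}],
                        }
                    ]
                },
            },
        },
    }
    service = {
        "apiVersion": "v1",
        "kind": "Service",
        "metadata": {"name": "openmpp-lb"},
        "spec": {
            "type": "LoadBalancer",  # asks the cloud provider for a public IP
            "selector": labels,
            "ports": [{"port": 80, "targetPort": port}],
        },
    }
    # A v1 List lets both objects be applied from a single document.
    return {"apiVersion": "v1", "kind": "List", "items": [deployment, service]}


if __name__ == "__main__":
    print(json.dumps(openmpp_manifests("example/openmpp:latest"), indent=2))
```

Saved under a hypothetical name such as `manifests.py`, the output could be piped into the cluster with `python manifests.py | kubectl apply -f -`.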
Next steps
openmpp and clone jacek-dudek/openmpp-on-k8s as a baseline
Clarifying project direction and deliverables: We decided to work towards implementing a cloud offering that has feature parity with the existing microsimulation web service operated by the OpenM++ team on GCP.
Progress made: Did some more background reading of the Kubernetes documentation. Identified Kubernetes objects that will be needed in subsequent iterations of the service. Located a GitHub project that appears to be an implementation of OpenMPI on Kubernetes: https://github.com/everpeace/kube-openmpi
This should enable us to host the OM++ web service on aaw-dev as a starting iteration.
To be further elaborated over the duration of Iteration 0
You can find my notes on Kubeflow's integrated MPI training operator, which I used for my POC, here: https://github.com/StatCan/aaw-private/issues/95. The everpeace/kube-openmpi project was evaluated, but it was created 5 years ago and has not been maintained, whereas the Kubeflow training operators are in active development.
Regarding provisioning a separate node pool, are there any project requirements that would need this yet? With the MPI training operator it would be as simple as labeling the manifest with the node type to use, but I think this should come as a special request from specific projects, and only after they have hit a limitation with our existing nodes.
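To illustrate what "labeling the manifest with the node type" could look like, here is a rough sketch rather than a tested configuration. It builds an MPIJob manifest as a Python dict and pins only the worker pods to a node pool through `nodeSelector`. The `kubeflow.org/v1` API version, the `agentpool: mpi-pool` label, the image, and the job name are assumptions; the containers' commands are omitted, so this is not a complete runnable job.

```python
import json


def mpijob_with_node_selector(name, image, workers, node_selector):
    """Return an MPIJob manifest whose workers run only on matching nodes.

    The apiVersion and every name here are illustrative assumptions;
    check them against the operator version installed on the cluster.
    """
    launcher = {
        "replicas": 1,
        "template": {
            "spec": {"containers": [{"name": "launcher", "image": image}]}
        },
    }
    worker = {
        "replicas": workers,
        "template": {
            "spec": {
                # Standard pod field: schedule only on nodes with these labels.
                "nodeSelector": dict(node_selector),
                "containers": [{"name": "worker", "image": image}],
            }
        },
    }
    return {
        "apiVersion": "kubeflow.org/v1",  # assumed training-operator API group
        "kind": "MPIJob",
        "metadata": {"name": name},
        "spec": {
            "slotsPerWorker": 1,
            "mpiReplicaSpecs": {"Launcher": launcher, "Worker": worker},
        },
    }


if __name__ == "__main__":
    job = mpijob_with_node_selector(
        name="openmpp-run",                    # hypothetical job name
        image="example/openmpp:latest",        # hypothetical image
        workers=4,
        node_selector={"agentpool": "mpi-pool"},  # hypothetical node label
    )
    print(json.dumps(job, indent=2))
```

Leaving the launcher unpinned is a choice made for this sketch: the launcher is lightweight, so only the workers are directed to the dedicated pool.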
Continuing this work in https://github.com/StatCan/openmpp/issues/3
A system that provisions the OpenM++ framework into some type of cloud-based deployment, either via VM or containerized.