Open owlleg6 opened 2 years ago
After further investigation, it looks like the main purpose of this task is to eliminate the use of IAM Users and replace them.
On deploy, our infrastructure creates 3 IAM Users and 3 AWS IAM Access Keys for each of them:
In the case of _jupyternotebook and mlflow, it would be relatively easy to replace them with IAM Roles, because none of their keys are used in further actions during deployment.
However, the case of controller is much more complex: its keys are used in 2 connections:
These connections are used by Pods for training, packaging, etc.
From the perspective of the ops team, it is possible to start using a Service Account with attached Annotations that are linked to an IAM Role with strict policies for S3 and ECR access. As of now, ODAHU cannot accept Annotations and attach them to the Pods.
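To make the ops-side idea concrete, here is a minimal sketch of a Service Account manifest that carries a role-linking Annotation. It assumes the cluster uses EKS IAM Roles for Service Accounts, where the annotation key is `eks.amazonaws.com/role-arn`; this issue does not state which mechanism will be used. The Service Account name, namespace, and role ARN are hypothetical placeholders.

```python
import json

# Annotation key used by EKS IAM Roles for Service Accounts (IRSA).
# Assumption: the issue does not say which annotation mechanism ops will pick.
IRSA_ROLE_ANNOTATION = "eks.amazonaws.com/role-arn"


def build_service_account(name: str, namespace: str, role_arn: str) -> dict:
    """Return a Kubernetes ServiceAccount manifest annotated with an IAM Role ARN."""
    return {
        "apiVersion": "v1",
        "kind": "ServiceAccount",
        "metadata": {
            "name": name,
            "namespace": namespace,
            "annotations": {IRSA_ROLE_ANNOTATION: role_arn},
        },
    }


if __name__ == "__main__":
    # All three values below are hypothetical placeholders.
    manifest = build_service_account(
        name="example-training-sa",
        namespace="example-training",
        role_arn="arn:aws:iam::123456789012:role/example-training-role",
    )
    # JSON is valid YAML, so this output can be passed to `kubectl apply -f -`.
    print(json.dumps(manifest, indent=2))
```

The IAM Role itself, with its S3 and ECR policies, would still be created by the infrastructure code; the manifest only links Pods that use this Service Account to that Role.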
Due to the complex nature of this issue, we need to design logic that involves both the developer and ops teams.
Main idea: when Pod Annotations are forwarded (instead of AWS IAM Access Keys), ODAHU should attach them to certain Pods, thus allowing access to ECR and S3. Note that the connection is still required as an entity; it just uses Annotations instead of Access Keys, for example.
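A rough sketch of what this could look like on the ODAHU side is given below. It is not the actual ODAHU connection model: the `Connection` class, its fields, and `apply_connection` are hypothetical names used only for illustration, and the annotation key in the usage example (`iam.amazonaws.com/role`, as used by tools such as kube2iam) is an assumption.

```python
from copy import deepcopy
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class Connection:
    """Hypothetical connection entity: holds either Access Keys or Pod Annotations."""
    name: str
    access_key_id: Optional[str] = None
    secret_access_key: Optional[str] = None
    pod_annotations: Dict[str, str] = field(default_factory=dict)

    def uses_annotations(self) -> bool:
        return bool(self.pod_annotations)


def apply_connection(pod: dict, conn: Connection) -> dict:
    """Return a copy of the Pod manifest with the connection's Annotations attached.

    Key-based connections are returned unchanged here; the existing
    Access Key handling is outside the scope of this sketch.
    """
    result = deepcopy(pod)
    if conn.uses_annotations():
        metadata = result.setdefault("metadata", {})
        annotations = metadata.setdefault("annotations", {})
        annotations.update(conn.pod_annotations)
    return result


if __name__ == "__main__":
    # Placeholder Pod and connection values, for illustration only.
    pod = {"apiVersion": "v1", "kind": "Pod", "metadata": {"name": "trainer"}, "spec": {}}
    conn = Connection(
        name="models-output",
        pod_annotations={"iam.amazonaws.com/role": "example-training-role"},
    )
    print(apply_connection(pod, conn)["metadata"]["annotations"])
```

Keeping the connection as the carrier of the Annotations means training and packaging Pods can still be configured through the same entity, whichever credential mechanism is behind it.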
Since the AWS Organization that administers the AWS account for the ODAHU project has prohibited automated creation of IAM Users during cluster deployment, we need to develop a solution that uses IAM Roles instead. Using IAM Roles is a common best practice in this case, so we need to update our infrastructure to align with it.