gravitational / teleport

The easiest, and most secure way to access and protect all of your infrastructure.
https://goteleport.com
GNU Affero General Public License v3.0

AWS RDS Guides need updating #25304

Closed tcsc closed 14 hours ago

tcsc commented 1 year ago

Found it harder than necessary to add an RDS Postgres database to a cluster after following the AWS HA Terraform guide, using both the UI enrolment workflow and the RDS Postgres Guide.

To be fair, some of this was my unfamiliarity with RDS, but the big issues I encountered were:

  1. I created the cluster using the HA guide, then tried to add the database post-creation by enabling the DB listener in the terraform and re-applying. This changed the EC2 launch template and security group, but didn't actually reconfigure the running proxies, blocking any connectivity to the DB agent. Adding some notes about how to modify the cluster and/or trigger a Scaling Group refresh to the HA Terraform guide may help. See also: #25259

  2. The recommended policy provided by both the workflow and the guide was only a subset of the policy I actually needed to make things work. The provided policy only covered the rds-db:connect clause; the final policy I used to get things working is included below.

  3. The guide does not stress that IAM Authentication is optional on the AWS side for Postgres RDS instances. I spent ages trying to get DB access to work with IAM Authentication disabled on the database side, which was never going to work. Note that Teleport will automatically enable IAM Authentication if it has the rds:ModifyDBInstance permissions on the target Database.

  4. When saying you need to attach the policy to the EC2 instance, the workflow doesn't mention that you need to create an IAM role to house the policy before it can be attached. The links included do some of the heavy lifting, but it could probably be clearer. (Also, Teleport will automatically attach such a policy to the role assumed by the EC2 instance running the database agent if it has the iam:(Get|Put)RolePolicy permissions on the EC2 instance's role, so it may be better to highlight that rather than the connect policy itself.)

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": "rds-db:connect",
            "Resource": "arn:aws:rds-db:ap-southeast-2:023992783472:dbuser:db-5QFIC6SRPK5RC3KUGMJHZBAPSY/*"
        },
        {
            "Effect": "Allow",
            "Action": [
                "rds:DescribeDBClusters"
            ],
            "Resource": "arn:aws:rds:ap-southeast-2:023992783472:cluster:aws-ha-db"
        },
        {
            "Effect": "Allow",
            "Action": [
                "rds:ModifyDBInstance"
            ],
            "Resource": "arn:aws:rds:ap-southeast-2:023992783472:db:aws-ha-db"
        },
        {
            "Effect": "Allow",
            "Action": [
                "iam:GetRolePolicy",
                "iam:PutRolePolicy"
            ],
            "Resource": "arn:aws:iam::023992783472:role/aws-ha-db-listener"
        }
    ]
}

Note that the aws-ha-db-listener role was applied to the EC2 instance running the Teleport database agent.

I'm still trying to determine whether the Teleport database agent is expected to run without these rights (that is, if the Database, EC2 Role, etc are already in the state Teleport expects). I expect it should (I can totally see a sysadm balking at adding them), but having the alternatives available is probably useful.

GavinFrazar commented 2 months ago

for reference: the relevant docs link is now at https://goteleport.com/docs/deploy-a-cluster/deployments/aws-ha-autoscale-cluster-terraform/

GavinFrazar commented 2 months ago
  1. We can track this one in that issue (#25259).
  2. This is fixed in the latest docs.
  3. The guide does not stress that IAM Authentication is optional on the AWS side for Postgres RDS instances. I spent ages trying to get DB access to work with IAM Authentication disabled on the database side, which was never going to work. Note that Teleport will automatically enable IAM Authentication if it has the rds:ModifyDBInstance permissions on the target Database.

The latest docs guide emphasizes this in the pre-reqs, but I'm not sure about the v13 guide since that link doesn't work anymore. I think this point is ok now.
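To illustrate the IAM Authentication point, here is a minimal sketch of a check that flags instances where it is still disabled. It assumes a response shaped like the RDS DescribeDBInstances API output (for example from boto3's `describe_db_instances`); the function name and the sample data are hypothetical and are not taken from the Teleport docs.

```python
from typing import Any, Dict, List


def instances_without_iam_auth(response: Dict[str, Any]) -> List[str]:
    """Return identifiers of instances whose IAM database authentication is off.

    `response` is assumed to have the shape of a DescribeDBInstances result:
    {"DBInstances": [{"DBInstanceIdentifier": ..., "IAMDatabaseAuthenticationEnabled": ...}]}
    """
    return [
        instance["DBInstanceIdentifier"]
        for instance in response.get("DBInstances", [])
        # Treat a missing flag as "disabled" so it gets flagged for review.
        if not instance.get("IAMDatabaseAuthenticationEnabled", False)
    ]


if __name__ == "__main__":
    # Hand-written sample data for illustration, not real AWS output.
    sample = {
        "DBInstances": [
            {"DBInstanceIdentifier": "aws-ha-db", "IAMDatabaseAuthenticationEnabled": False},
            {"DBInstanceIdentifier": "other-db", "IAMDatabaseAuthenticationEnabled": True},
        ]
    }
    print(instances_without_iam_auth(sample))
```

Any instance this returns would need IAM Authentication enabled on the AWS side before Teleport can connect to it, either manually or by Teleport itself if it has been granted rds:ModifyDBInstance.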

  4. When saying you need to attach the policy to the EC2 instance, the workflow doesn't mention that you need to create an IAM role to house the policy before it can be attached. The links included do some of the heavy lifting, but it could probably be more clear.

This is a good point, and I think we can improve here. We should emphasize that users need to create a role. We also need to de-emphasize the docs instructions about optionally setting up the policy on an IAM user: real setups are going to use a role and probably attach it to an EC2 instance with an instance profile. An IAM user is only for dev testing, and even then I don't think it makes much sense.

So I'll push a docs PR for this.
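As a rough sketch of what "create a role" involves: a role that an EC2 instance can assume needs a trust policy naming the EC2 service as principal, and the role is then attached to the instance through an instance profile. The helper below only builds that trust policy document; the function name is hypothetical, and this is not the exact text of the Teleport guide.

```python
import json
from typing import Any, Dict


def ec2_trust_policy() -> Dict[str, Any]:
    """Build a trust policy that lets EC2 instances assume the role."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                # The EC2 service principal assumes the role on the instance's behalf.
                "Principal": {"Service": "ec2.amazonaws.com"},
                "Action": "sts:AssumeRole",
            }
        ],
    }


if __name__ == "__main__":
    # Print the document so it can be reviewed or saved when creating the role.
    print(json.dumps(ec2_trust_policy(), indent=4))
```

The permissions policy (for example the rds-db:connect statement) is attached to this role separately; the trust policy only controls who may assume it.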

GavinFrazar commented 2 months ago

I'm still trying to determine whether the Teleport database agent is expected to run without these rights (that is, if the Database, EC2 Role, etc are already in the state Teleport expects). I expect it should (I can totally see a sysadm balking at adding them), but having the alternatives available is probably useful.

If you're referring to this:

        {
            "Effect": "Allow",
            "Action": [
                "iam:GetRolePolicy",
                "iam:PutRolePolicy"
            ],
            "Resource": "arn:aws:iam::023992783472:role/aws-ha-db-listener"
        }

then yeah, it does not need that. You can configure the permissions yourself. The only permission it absolutely requires is rds-db:connect. I am moving all of our documentation and scripts in that direction, because we get questions about those get/put policy permissions a lot (understandably!) and they require a permissions boundary attached to the role to prevent privilege escalation. The docs updates will explain the required and optional IAM permissions and what each permission is used for.
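For reference, a minimal sketch of a policy containing only the required rds-db:connect statement, built from the same ARN pattern used in the policy earlier in this thread. The function and parameter names are hypothetical; the default wildcard user mirrors the original policy and could be narrowed to a specific database user.

```python
import json
from typing import Any, Dict


def rds_connect_policy(
    region: str, account_id: str, resource_id: str, db_user: str = "*"
) -> Dict[str, Any]:
    """Build a policy that grants only rds-db:connect on one database resource.

    `resource_id` is the database resource ID (the "db-..." value),
    not the instance name.
    """
    resource = f"arn:aws:rds-db:{region}:{account_id}:dbuser:{resource_id}/{db_user}"
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": "rds-db:connect",
                "Resource": resource,
            }
        ],
    }


if __name__ == "__main__":
    # Values copied from the policy posted earlier in this issue.
    policy = rds_connect_policy(
        "ap-southeast-2", "023992783472", "db-5QFIC6SRPK5RC3KUGMJHZBAPSY"
    )
    print(json.dumps(policy, indent=4))
```

With only this statement in place, IAM Authentication on the database and the role's policy attachment have to be set up by hand, since the agent has no rds:ModifyDBInstance or iam:PutRolePolicy rights to do it automatically.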

GavinFrazar commented 14 hours ago

Circling back to this now.

Item 1 is being tracked in the other ticket and I actually figured that one out already, just need to go fix the terraform to trigger instance refresh.
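Until the Terraform is fixed, a refresh can also be started by hand. The sketch below is one possible way to do that with boto3 and is not the planned Terraform change; the group name in the usage example is hypothetical, and the 90% healthy threshold is an arbitrary assumption.

```python
from typing import Any, Dict


def instance_refresh_request(asg_name: str, min_healthy_percentage: int = 90) -> Dict[str, Any]:
    """Build the parameters for an Auto Scaling StartInstanceRefresh call."""
    if not 0 <= min_healthy_percentage <= 100:
        raise ValueError("min_healthy_percentage must be between 0 and 100")
    return {
        "AutoScalingGroupName": asg_name,
        # Rolling replacement keeps part of the group in service during the refresh.
        "Strategy": "Rolling",
        "Preferences": {"MinHealthyPercentage": min_healthy_percentage},
    }


def start_instance_refresh(asg_name: str) -> str:
    """Start the refresh and return its ID. Requires boto3 and AWS credentials."""
    import boto3  # third-party; imported lazily so the builder above has no dependencies

    client = boto3.client("autoscaling")
    response = client.start_instance_refresh(**instance_refresh_request(asg_name))
    return response["InstanceRefreshId"]


if __name__ == "__main__":
    # "example-proxy-asg" is a placeholder, not a name from the HA Terraform guide.
    print(instance_refresh_request("example-proxy-asg"))
```

Replacing the instances this way makes the running proxies pick up the new launch template, which is the step that re-applying the Terraform alone did not perform.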

So item 4 was the only remaining thing outstanding on this ticket.

The RDS guide is now quite clear about creating a role and links to EC2 instance profile setup instructions as well as IRSA for kube deployments, depending on what the user wants:

https://goteleport.com/docs/ver/17.x/enroll-resources/database-access/enroll-aws-databases/rds/#create-an-iam-role-for-teleport

Therefore I'm closing this issue as complete.