However, there is something weird with this PR: GitHub says it's made up of 800 file changes. That doesn't sound right to me.
This PR is now fixed after rebasing with the latest changes in your fuw-cicd branch.
Regarding the one for the bastion, may I suggest changing the database loading part so we can use the output of convert_production_db_to_latest_ver.sh directly?
I've updated the bastion playbook to search for the latest backup file in the gigadb/app/tools/files-url-updater/sql directory and use this file for restoring the gigadb database on the RDS instance.
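For reference, a minimal sketch of what those lookup tasks could look like; the module options are standard Ansible, but the pattern, path and variable names are assumptions rather than the actual playbook content:

- name: Get files in files-url-updater sql folder
  find:
    paths: ../../../../gigadb/app/tools/files-url-updater/sql
    patterns: "*.backup"
  register: backup_files
  delegate_to: localhost

- name: Select the most recent backup file
  set_fact:
    latest_backup: "{{ (backup_files.files | sort(attribute='mtime') | last).path }}"
  delegate_to: localhost

# latest_backup would then be passed to the pg_restore task that restores
# the gigadb database on the RDS instance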
I've rebased this PR with the latest changes in the fuw-cicd branch and moved the PostgreSQL client tools installation from bastion-aws-instance.tf into bastion_playbook.yml. The private_subnets have also been commented out in terraform.tf.
My terraform inventory list looks like this:
> ../../inventories/terraform-inventory.sh --list | jq
{
"all": {
"hosts": [
"16.162.161.219",
"16.162.189.6"
],
"vars": {
"ec2_bastion_public_ip": "16.162.161.219",
"ec2_private_ip": "10.99.0.241",
"ec2_public_ip": "16.162.189.6",
"rds_instance_address": "rds-server-staging.cfkc0cbc20ii.ap-east-1.rds.amazonaws.com"
}
},
"module_ec2_bastion_bastion": [
"16.162.161.219"
],
"module_ec2_bastion_bastion.0": [
"16.162.161.219"
],
"module_ec2_dockerhost_docker_host": [
"16.162.189.6"
],
"module_ec2_dockerhost_docker_host.0": [
"16.162.189.6"
],
"module_ec2_dockerhost_docker_host_eip": [
"16.162.189.6"
],
"module_ec2_dockerhost_docker_host_eip.0": [
"16.162.189.6"
],
"module_ec2_dockerhost_docker_host_eip_assoc": [
"16.162.189.6"
],
"module_ec2_dockerhost_docker_host_eip_assoc.0": [
"16.162.189.6"
],
"name_bastion_server_staging": [
"16.162.161.219"
],
"name_gigadb_server_staging_rija": [
"16.162.189.6"
],
"system_t3_micro-centos8": [
"16.162.161.219",
"16.162.189.6"
],
"type_aws_eip": [
"16.162.189.6"
],
"type_aws_eip_association": [
"16.162.189.6"
],
"type_aws_instance": [
"16.162.161.219",
"16.162.189.6"
]
}
Hi @pli888, how should the playbooks be run? I've tried 4 different ways:
I use ansible-playbook -i ../../inventories dockerhost_playbook.yml, but I think this won't work in your case because dockerhost_playbook.yml won't pick up your EC2 server, due to the AWS username at the end of the name tag value.
ansible-playbook -i ../../inventories dockerhost_playbook.yml should work if you change the hosts line in dockerhost_playbook.yml to:
hosts: name_gigadb_server_staging*:name_gigadb_server_live*
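In context, the top of the play would then look something like this (only the hosts pattern comes from the suggestion above; the rest of the play is elided):

- hosts: name_gigadb_server_staging*:name_gigadb_server_live*
  # the wildcard matches inventory groups such as name_gigadb_server_staging_rija
  # produced by terraform-inventory, regardless of the username suffix
  tasks: []   # existing dockerhost provisioning tasks elided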
Thanks @pli888, that worked and dockerhost_playbook.yml performed OK. I've then applied the same change to bastion_playbook.yml and it seems to have done the trick; however, it fails in the play "Test pg_isready can connect to RDS instance" with this error:
TASK [Test pg_isready can connect to RDS instance] ***********************************************************************************************************************
fatal: [16.162.189.6]: FAILED! => {"changed": false, "msg": "no command given", "rc": 256}
When running in verbose mode, the output is: https://gist.github.com/rija/565170272e5e40bf904c3270320012aa
PLAY RECAP ***************************************************************************************************************************************************************
16.162.189.6 : ok=5 changed=1 unreachable=0 failed=1 skipped=0 rescued=0 ignored=0
Looks like PostgreSQL failed to install on the bastion:
[...]$ /usr/pgsql-11/bin/pg_isready -h xxxxxxxxxxi.yyyyyy.rds.amazonaws.com
-bash: /usr/pgsql-11/bin/pg_isready: No such file or directory
I think there are various problems with the PostgreSQL client install play. To start with, we should use dnf instead of yum everywhere. Also, dnf needs sudo, so the become: yes option needs to be present for all calls to the dnf module too.
$ git diff
...
tasks:
- name: Disable postgresql module in AppStream
command: dnf -qy module disable postgresql
- become: true
+ become: yes
- rpm_key:
state: present
key: https://download.postgresql.org/pub/repos/yum/RPM-GPG-KEY-PGDG
- name: Install PostgreSQL repo
- yum:
+ become: yes
+ dnf:
name: https://download.postgresql.org/pub/repos/yum/reporpms/EL-8-x86_64/pgdg-redhat-repo-latest.noarch.rpm
state: present
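Putting those two suggestions together, the client install tasks might end up roughly like this (a sketch, not the exact contents of the playbook; the repo URLs are the ones already used above, while the postgresql11 package name is an assumption):

  tasks:
    - name: Disable postgresql module in AppStream
      command: dnf -qy module disable postgresql
      become: yes

    - name: Import the PGDG repository signing key
      become: yes
      rpm_key:
        state: present
        key: https://download.postgresql.org/pub/repos/yum/RPM-GPG-KEY-PGDG

    - name: Install PostgreSQL repo
      become: yes
      dnf:
        name: https://download.postgresql.org/pub/repos/yum/reporpms/EL-8-x86_64/pgdg-redhat-repo-latest.noarch.rpm
        state: present

    - name: Install PostgreSQL 11 client packages
      become: yes
      dnf:
        name: postgresql11   # client tools land in /usr/pgsql-11/bin
        state: present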
Even after that, it still isn't working.
By ssh-ing into the bastion, I can manually install the tools by issuing: dnf install -y postgresql-server. But it installs the wrong version (10), from the appstream repository instead of the PostgreSQL repo, and it doesn't work when used in the Ansible playbook (it says there's nothing to do, even when I know the tools are not installed).
If I search for the PostgreSQL package of any version in the specific repo with dnf --repo pgdg11 search postgresql, the result doesn't seem to contain installable packages.
Interestingly, when I replace the call to the dnf module in the task for installing PostgreSQL with:
- name: Install PostgreSQL client packages
command: "dnf -y install postgresql-server"
become: yes
I got this output:
"stdout": "Last metadata expiration check: 1:21:57 ago on Thu 14 Oct 2021 04:24:43 PM UTC.\nPackage postgresql11-server-11.13-1PGDG.rhel8.x86_64 is already installed.\nDependencies resolved.\nNothing to do.\nComplete!",
"stdout_lines": [
"Last metadata expiration check: 1:21:57 ago on Thu 14 Oct 2021 04:24:43 PM UTC.",
"Package postgresql11-server-11.13-1PGDG.rhel8.x86_64 is already installed.",
"Dependencies resolved.",
"Nothing to do.",
"Complete!"
]
but I can't see the installed packages on the bastion, and the next task that calls pg_isready fails because it cannot find the command either.
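A quick way to check what actually got installed and where (a diagnostic sketch, not a command from the playbook):

# list any PGDG PostgreSQL 11 packages known to rpm, then the files the client package owns
rpm -qa 'postgresql11*'
rpm -ql postgresql11 | grep '/bin/'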
@rija Unfortunately, I'm not able to replicate the pg_isready error. I've double-checked that the dnf -qy module disable postgresql Ansible step removes postgresql from AppStream. Before this step is executed, you will see this on the bastion server:
[centos@ip-10-99-0-74 ~]$ dnf list "postgresql*"
Last metadata expiration check: 0:43:27 ago on Fri 15 Oct 2021 03:24:12 AM UTC.
Available Packages
postgresql.x86_64 10.17-1.module_el8.4.0+823+f0dbe136 appstream
postgresql-contrib.x86_64 10.17-1.module_el8.4.0+823+f0dbe136 appstream
postgresql-docs.x86_64 10.17-1.module_el8.4.0+823+f0dbe136 appstream
postgresql-jdbc.noarch 42.2.24-1.rhel8 pgdg-common
postgresql-jdbc-javadoc.noarch 42.2.24-1.rhel8 pgdg-common
Then, after the postgres AppStream module is disabled, the output is this:
[centos@ip-10-99-0-74 ~]$ dnf list "postgresql*"
Last metadata expiration check: 0:43:05 ago on Fri 15 Oct 2021 03:24:12 AM UTC.
Available Packages
postgresql-jdbc.noarch 42.2.24-1.rhel8 pgdg-common
postgresql-jdbc-javadoc.noarch 42.2.24-1.rhel8 pgdg-common
postgresql-odbc.x86_64 10.03.0000-2.el8 appstream
postgresql-odbc-tests.x86_64 10.03.0000-2.el8 appstream
postgresql-unit10.x86_64 7.2-1.rhel8 pgdg10
This should allow PostgreSQL 11 to be installed from the PostgreSQL repository by Ansible, so that I see its client tools installed on my bastion server:
[centos@ip-10-99-0-74 ~]$ cd /usr/pgsql-11/
[centos@ip-10-99-0-74 pgsql-11]$ ls
bin lib share
[centos@ip-10-99-0-74 pgsql-11]$ cd bin
[centos@ip-10-99-0-74 bin]$ pwd
/usr/pgsql-11/bin
[centos@ip-10-99-0-74 bin]$ ls
clusterdb dropdb pg_basebackup pg_dump pg_receivewal pg_test_fsync pg_waldump vacuumdb
createdb dropuser pgbench pg_dumpall pg_restore pg_test_timing psql
createuser pg_archivecleanup pg_config pg_isready pg_rewind pg_upgrade reindexdb
This then allows the pg_isready Ansible step to work:
$ ansible-playbook -i ../../inventories bastion_playbook.yml
[WARNING]: Invalid characters were found in group names but not replaced, use -vvvv to see details
PLAY [Restore PostgreSQL database on RDS instance using pg_restore] ************************************************************
TASK [Gathering Facts] *********************************************************************************************************
ok: [18.162.41.222]
TASK [Disable postgresql module in AppStream] **********************************************************************************
[WARNING]: Consider using the dnf module rather than running 'dnf'. If you need to use command because dnf is insufficient you
can add 'warn: false' to this command task or set 'command_warnings=False' in ansible.cfg to get rid of this message.
changed: [18.162.41.222]
TASK [rpm_key] *****************************************************************************************************************
ok: [18.162.41.222]
TASK [Install PostgreSQL repo] *************************************************************************************************
ok: [18.162.41.222]
TASK [Install PostgreSQL 11 client packages] ***********************************************************************************
changed: [18.162.41.222]
TASK [Test pg_isready can connect to RDS instance] *****************************************************************************
changed: [18.162.41.222]
TASK [debug] *******************************************************************************************************************
ok: [18.162.41.222] => {
"msg": "rds-server-staging.********20ii.ap-east-1.rds.amazonaws.com:5432 - accepting connections"
}
TASK [Get files in files-url-updater sql folder] *******************************************************************************
fatal: [18.162.41.222]: FAILED! => {"changed": false, "module_stderr": "sudo: a password is required\n", "module_stdout": "", "msg": "MODULE FAILURE\nSee stdout/stderr for the exact error", "rc": 1}
PLAY RECAP *********************************************************************************************************************
18.162.41.222 : ok=7 changed=3 unreachable=0 failed=1 skipped=0 rescued=0 ignored=0
I work around the error I get with the Get files in files-url-updater sql folder task by executing:
$ ansible-playbook --ask-become-pass -i ../../inventories bastion_playbook.yml
I've made the changes you suggested to use dnf instead of yum, so there are a few new commits in this PR now.
Hi @pli888,
Thanks for looking into it. I've started from scratch after pulling the latest changes.
It still fails at the pg_isready task:
TASK [Install PostgreSQL repo] *******************************************************************************************************************************************
changed: [18.163.46.93]
TASK [Install PostgreSQL 11 client packages] *****************************************************************************************************************************
changed: [18.163.46.93]
TASK [Test pg_isready can connect to RDS instance] ***********************************************************************************************************************
fatal: [18.163.46.93]: FAILED! => {"changed": false, "msg": "no command given", "rc": 256}
I've run all the commands you've shown in your comment and I get the same output as you (which is progress, as yesterday the /usr/pgsql-11/ directory didn't even exist). So I need to figure out why that command fails.
The verbose output is not much more helpful at first glance, although the _raw_params field being null suggests the command string never made it to the module. I wonder what the verbose output would say if it were successful:
fatal: [18.163.46.93]: FAILED! => {
"changed": false,
"invocation": {
"module_args": {
"_raw_params": null,
"_uses_shell": false,
"argv": null,
"chdir": null,
"creates": null,
"executable": null,
"removes": null,
"stdin": null,
"stdin_add_newline": true,
"strip_empty_ends": true,
"warn": true
}
},
"msg": "no command given",
"rc": 256
}
Manually running the command on bastion works:
[centos@ip-10-99-0-76 ~]$ /usr/pgsql-11/bin/pg_isready -h rds-server-staging.cfkc0cbc20ii.ap-east-1.rds.amazonaws.com
rds-server-staging.cfkc0cbc20ii.ap-east-1.rds.amazonaws.com:5432 - accepting connections
So it seems it is an Ansible thing. My version is:
> ansible-playbook --version
ansible-playbook 2.9.9
config file = None
configured module search path = ['/Users/rijamenage/.ansible/plugins/modules', '/usr/share/ansible/plugins/modules']
ansible python module location = /usr/local/Cellar/ansible/2.9.9/libexec/lib/python3.8/site-packages/ansible
executable location = /usr/local/bin/ansible-playbook
python version = 3.8.3 (default, May 27 2020, 20:54:22) [Clang 11.0.3 (clang-1103.0.32.59)]
Upgrading Ansible fixed the issue with pg_isready for me:
> ansible-playbook --version
ansible-playbook [core 2.11.6]
config file = None
configured module search path = ['/Users/rijamenage/.ansible/plugins/modules', '/usr/share/ansible/plugins/modules']
ansible python module location = /usr/local/Cellar/ansible/4.7.0/libexec/lib/python3.9/site-packages/ansible
ansible collection location = /Users/rijamenage/.ansible/collections:/usr/share/ansible/collections
executable location = /usr/local/bin/ansible-playbook
python version = 3.9.7 (default, Oct 13 2021, 06:44:56) [Clang 12.0.0 (clang-1200.0.32.29)]
jinja version = 3.0.2
libyaml = True
The problem I'm running into now is with looking for the output of files-url-updater. It never asks for a password (btw, there's actually no reason it needs sudo for the action in question) and instead fails immediately:
TASK [Get files in files-url-updater sql folder] ****************************************************************************************************************************
task path: /Users/Shared/pli888-gigadb-website/ops/infrastructure/envs/staging/bastion_playbook.yml:35
<localhost> ESTABLISH LOCAL CONNECTION FOR USER: rijamenage
<localhost> EXEC /bin/sh -c 'echo ~rijamenage && sleep 0'
<localhost> EXEC /bin/sh -c '( umask 77 && mkdir -p "` echo /Users/rijamenage/.ansible/tmp `"&& mkdir "` echo /Users/rijamenage/.ansible/tmp/ansible-tmp-1634297269.269126-85779-103997721710052 `" && echo ansible-tmp-1634297269.269126-85779-103997721710052="` echo /Users/rijamenage/.ansible/tmp/ansible-tmp-1634297269.269126-85779-103997721710052 `" ) && sleep 0'
Using module file /usr/local/Cellar/ansible/4.7.0/libexec/lib/python3.9/site-packages/ansible/modules/find.py
<localhost> PUT /Users/rijamenage/.ansible/tmp/ansible-local-85677_n4udhde/tmpomoc5i5z TO /Users/rijamenage/.ansible/tmp/ansible-tmp-1634297269.269126-85779-103997721710052/AnsiballZ_find.py
<localhost> EXEC /bin/sh -c 'chmod u+x /Users/rijamenage/.ansible/tmp/ansible-tmp-1634297269.269126-85779-103997721710052/ /Users/rijamenage/.ansible/tmp/ansible-tmp-1634297269.269126-85779-103997721710052/AnsiballZ_find.py && sleep 0'
<localhost> EXEC /bin/sh -c 'sudo -H -S -n -u root /bin/sh -c '"'"'echo BECOME-SUCCESS-bztqrexbgfmvezafvxbdzyayszuqvdme ; /usr/local/Cellar/ansible/4.7.0/libexec/bin/python3.9 /Users/rijamenage/.ansible/tmp/ansible-tmp-1634297269.269126-85779-103997721710052/AnsiballZ_find.py'"'"' && sleep 0'
<localhost> EXEC /bin/sh -c 'rm -f -r /Users/rijamenage/.ansible/tmp/ansible-tmp-1634297269.269126-85779-103997721710052/ > /dev/null 2>&1 && sleep 0'
fatal: [18.163.46.93 -> localhost]: FAILED! => {
"changed": false,
"module_stderr": "sudo: a password is required\n",
"module_stdout": "",
"msg": "MODULE FAILURE\nSee stdout/stderr for the exact error",
"rc": 1
}
I think the issue is that these two tasks are local tasks in the midst of a remote playbook. I've tried many variations of delegate_to, local_action and become, to no avail.
I could get those two tasks to succeed (and the rest of the bastion playbook) by running the entire playbook with sudo, which doesn't feel right.
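For the record, the general shape of the variation I'd expect to need here, shown only as a sketch of the pattern and not as a claim that it solves this particular failure:

- name: Get files in files-url-updater sql folder
  delegate_to: localhost
  become: false        # the local find has no need for sudo on the control machine
  find:
    paths: ../../../../gigadb/app/tools/files-url-updater/sql
    patterns: "*.backup"
  register: backup_files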
An alternative approach would be to get rid of these two local tasks and prompt the user in tf_init.sh for a backup file path; the backup file location would be written to ansible.properties, and we would then add the corresponding variable to the ops/infrastructure/inventories/hosts file in the [all:vars] section.
An added benefit of that approach is that the user has more flexibility to choose a backup file (e.g. because the latest source backup is corrupted).
> An alternative approach would be to get rid of these two local tasks and prompt the user in tf_init.sh for a backup file path, which will have the backup file location written to ansible.properties and then we add the corresponding variable in ops/infrastructure/inventories/hosts file in the [all:vars] section.
The above has been implemented in the 1471b9c commit for this PR. So now, on running tf_init.sh, you will see the following prompt (with my response):
You need to specify a backup file created by the files-url-updater tool: gigadbv3_20210929_v9.3.25.backup
Finally, I've added the AWS username to the end of the name tag of the bastion server. This helps to distinguish multiple bastion servers and I'm sure it's not allowed for EC2 servers to have identical name tags.
Hi @pli888,
I've got a new error that happened during terraform apply:
module.ec2_dockerhost.aws_eip_association.docker_host_eip_assoc: Still creating... [10s elapsed]
module.ec2_dockerhost.aws_eip_association.docker_host_eip_assoc: Creation complete after 12s [id=eipassoc-0fa7d25e87ed71876]
Error: Error creating DB Instance: DBInstanceAlreadyExists: DB instance already exists
status code: 400, request id: 266df3e2-5cd6-4ceb-a3ba-550d6f74fb91
on .terraform/modules/rds.db/modules/db_instance/main.tf line 20, in resource "aws_db_instance" "this":
20: resource "aws_db_instance" "this" {
The error is thrown by the public Terraform module for aws-rds. I've checked, and I have no existing RDS instance already deployed.
Could it be a tag clash? I noticed you have an RDS instance named rds-server-staging, and that same identifier will be generated for me in rds-instance.tf:
module "db" {
source = "terraform-aws-modules/rds/aws"
# Only lowercase alphanumeric characters and hyphens allowed in "identifier"
identifier = "rds-server-${var.deployment_target}"
The fix is probably to add the IAM username as a suffix there.
Hi @pli888, the following patch made it work for me:
diff --git a/ops/infrastructure/getIAMUserNameToJSON.sh b/ops/infrastructure/getIAMUserNameToJSON.sh
index 48fa64f4d..df8c3dbc9 100755
--- a/ops/infrastructure/getIAMUserNameToJSON.sh
+++ b/ops/infrastructure/getIAMUserNameToJSON.sh
@@ -6,5 +6,5 @@
# the output has to be valid JSON
set -e
-userName=$(aws sts get-caller-identity --output text --query Arn | cut -d"/" -f2)
+userName=$(aws sts get-caller-identity --output text --query Arn | cut -d"/" -f2 | tr '[:upper:]' '[:lower:]')
jq -n --arg userName "$userName" '{"userName":$userName}'
\ No newline at end of file
diff --git a/ops/infrastructure/modules/rds-instance/rds-instance.tf b/ops/infrastructure/modules/rds-instance/rds-instance.tf
index e62f11e60..6b2a101f7 100644
--- a/ops/infrastructure/modules/rds-instance/rds-instance.tf
+++ b/ops/infrastructure/modules/rds-instance/rds-instance.tf
@@ -21,7 +21,7 @@ module "db" {
source = "terraform-aws-modules/rds/aws"
# Only lowercase alphanumeric characters and hyphens allowed in "identifier"
- identifier = "rds-server-${var.deployment_target}"
+ identifier = "rds-server-${var.deployment_target}-${var.owner}"
create_db_option_group = false
create_db_parameter_group = false
@pli888,
In tf_init.sh, it's better to write backup_file's value into .init_env_vars, and then have ansible_init.sh write backup_file's value to ansible.properties like all the other variables, instead of appending it to ansible.properties from tf_init.sh.
Otherwise, when we run tf_init.sh multiple times (e.g. because we are debugging something), the variable gets appended multiple times to ansible.properties, and that makes Ansible crash because it doesn't like duplicate values in ansible.properties.
Additionally, by using .init_env_vars, we avoid being prompted for the backup file location every time we call tf_init.sh, if that location hasn't changed.
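In other words, something along these lines (a rough sketch; the real scripts will differ in detail):

# tf_init.sh (sketch): remember the answer in .init_env_vars so re-runs
# neither re-prompt nor duplicate the variable
if ! grep -q '^backup_file=' .init_env_vars 2>/dev/null; then
    read -p "You need to specify a backup file created by the files-url-updater tool: " backup_file
    echo "backup_file=$backup_file" >> .init_env_vars
fi

# ansible_init.sh (sketch): write the value into ansible.properties once,
# like the other variables (assuming ansible_init.sh regenerates that file on each run)
source .init_env_vars
echo "backup_file=$backup_file" >> ansible.properties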
@rija I've made the changes you suggested now. The RDS server now has a name tag which ends with the user's IAM username. The backup_file variable is now first written into .init_env_vars by tf_init.sh and then by ansible_init.sh into the ansible.properties file.
@rija I noticed the change in getIAMUserNameToJSON.sh:
-userName=$(aws sts get-caller-identity --output text --query Arn | cut -d"/" -f2)
+userName=$(aws sts get-caller-identity --output text --query Arn | cut -d"/" -f2 | tr '[:upper:]' '[:lower:]')
I was wondering, when you execute terraform destroy, do you get errors with deleting your AWS resources?
@pli888, yes. I think I figured out the reason: it's because the AWS policies check the owner tag in a case-sensitive way. I've updated the policies to use StringEqualsIgnoreCase where StringEquals was used to check against ${aws:username}, and I can now destroy all my resources.
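For illustration, the kind of condition that change amounts to; the action, resource and tag key below are placeholders rather than the actual contents of the policies:

{
  "Effect": "Allow",
  "Action": "ec2:TerminateInstances",
  "Resource": "*",
  "Condition": {
    "StringEqualsIgnoreCase": {
      "aws:ResourceTag/Owner": "${aws:username}"
    }
  }
}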
Hi @pli888, a minor issue I just found in tf_init.sh:
93 echo "backup_file=../../../../gigadb/app/tools/files-url-updater/sql/$backup_file" >> .init_env_vars
It's better to ask the user for the full path instead of just the file name; otherwise, when we run tf_init.sh multiple times, $backup_file's value will have the same prefix prepended to the already saved full path, which leads to an increasingly long and incorrect value.
@rija Ok, I've made a note to change the code to prompt the user for the full path to the backup file when I continue work on my RDS restore-from-snapshot-and-backup branch. I will also update policy-ec2.md and policy-rds.md.
@pli888 sounds good to me
@kencho51 Let me know when this PR is working for you and I'll then merge it to fuw-cicd
Pull request for issues: #733, #735, #786
This is a pull request for the following functionalities:
Changes to Terraform

The root Terraform file at ops/terraform.tf has been updated with a custom VPC containing public, private and database subnets using the AWS VPC Terraform module. All AWS resources defined as modules in this terraform.tf reside in this custom VPC. For example, the existing docker_host module is located on a public subnet within the VPC. In addition, terraform.tf now contains two new modules:

rds module

The rds module defines an AWS RDS instance which provides a PostgreSQL RDBMS located on a VPC database subnet. It uses the AWS Terraform RDS module to configure a PostgreSQL RDBMS version 9.6 running on a t3.micro RDS instance. Its AWS security group allows only internal VPC clients to connect to it.
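For illustration, a minimal sketch of what such a module block can look like; the identifier and the two create_db_* flags come from snippets earlier in this thread, while the engine, version and instance class inputs are assumptions about the terraform-aws-modules/rds/aws module rather than a copy of rds-instance.tf:

module "db" {
  source = "terraform-aws-modules/rds/aws"

  # Only lowercase alphanumeric characters and hyphens allowed in "identifier"
  identifier = "rds-server-${var.deployment_target}"

  engine         = "postgres"
  engine_version = "9.6"
  instance_class = "db.t3.micro"

  create_db_option_group    = false
  create_db_parameter_group = false

  # allocated storage, master credentials, subnet group and security groups
  # would also be set here in the real rds-instance.tf
}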
ec2_bastion module

The ec2_bastion module defines an EC2 instance running Centos 8 which provides administrative access to the RDS instance. It is located in a public subnet on the custom VPC and its security group allows connections to it from all public IP addresses. It is expected that bastion server users will destroy this bastion instance when database administrative tasks have been done. To enable the bastion server to perform its RDBMS tasks, it is provisioned with PostgreSQL 11 client tools from within the bastion-aws-instance.tf.

Changes to Ansible
When scripts/ansible_init.sh is executed to prepare an environment directory, this script updates the Gitlab gigadb_db_host variable with the host domain of the RDS instance.

The original playbook.yml has been renamed to dockerhost_playbook.yml to signify that it executes against the docker_host EC2 instance. The major change in this playbook is that the references to postgres-preinstall and postgres-postinstall have been deleted, since these roles have been removed due to the use of the RDS service.

There is a new bastion_playbook.yml which executes against the bastion EC2 server. Firstly, the playbook checks it can connect to the RDS instance and then restores a gigadb database on it using the sql/production_like.pgdmp file that is generated as a product of running ./up.sh.

Changes to documentation

SETUP_CI_CD_PIPELINE.md has been updated with RDS-specific information.

Procedure for deploying GigaDB application with RDS service
Prerequisites

sql/production_like.pgdmp file that is created by ./up.sh
eip-ape1-staging-Peter-gigadb

Steps

build_staging step on Gitlab CI/CD pipeline
sd_gigadb step on Gitlab CI/CD pipeline

If you browse the GigaDB website on your staging server, you should see that the static web pages are displayed but there are error messages when viewing the dataset pages, probably due to the dropcontraints and dropindexes database migration steps executed by gigadb-deploy-jobs.yml. To fix this problem, we restore a gigadb database using production data on the RDS instance:

To get terraform to destroy bastion server: