bcgov / nr-forests-access-management

Authorization solution for BC natural resource sector
Apache License 2.0
8 stars 2 forks source link

SPIKE: Find out if Aurora or RDS can export to PostgreSQL native format #33

Closed ConradBoydElliottGustafson closed 2 years ago

ConradBoydElliottGustafson commented 2 years ago

We want to know if we are "locking in" to a particular flavor of postres or if we could move our data to something else (Azure, on-prem, etc.)

  1. Create an aurora db, put some data in it, export, import
  2. Same for RDS

AC: Try to export from Aurora and import into a local PostgreSQL db Try to export from RDS PostgreSQL and import into a local db Report the results and discuss with PO if it impacts our architecture

rparfig commented 2 years ago
  1. Used famdb (Dev) as a source for testing aurora db backups
rparfig commented 2 years ago
rparfig commented 2 years ago

Errors: postgres@CA-L3H3ZPN2://mnt/c/Clients/FSA/git$ psql --host=rds-temp-postgres.cp9oqzf51oiq.ca-central-1.rds.amazonaws.com --port=5432 --username=postgres --password --dbname=rds-postgres Password:

psql: error: could not connect to server: Connection timed out Is the server running on host "rds-temp-postgres.cp9oqzf51oiq.ca-central-1.rds.amazonaws.com" (10.12.53.118) and accepting TCP/IP connections on port 5432?

basilv commented 2 years ago

While RDS instance isn't accessible externally, AWS RDS allows one to do manual or automatic snapshots and restore as desired from snapshots to new or existing database instance. I'm not sure if the snapshot can be exported as a file (to e.g. use locally), that would be worth testing.

rparfig commented 2 years ago

@basilv : That is correct. Through my testing, those snapshots are Aurora RDS snapshots and only usable by RDS. As an alternative, you can save the snapshot to S3, but even if you download it from S3, it will be a Parquet format file, not a postgres-compatible backup file.

basilv commented 2 years ago

Sorry, thought you were talking above about managed RDS rather than Aurora.

rparfig commented 2 years ago

Update: Conversations with the AWS Team confirmed our findings. Aurora snapshots are not to be used for restoring non-Aurora Postgres DBs.

Alternatives:

Investigating DMS:

Richard Pardo-Figueroa @rpardofigueroa 8:31 AM hello team: I am facing this warning when trying to create a Replication instance using AWS DMS (Database Migration Service):

DMS requires access permissions to manage your VPC resources. By clicking Create replication instance, you grant permission for DMS to create a role that has this access.

followed by this error:

User: arn:aws:sts::521834415778:assumed-role/BCGOV_WORKLOAD_developer_umafubc9/rpardofi@idir is not authorized to perform: iam:GetRole on resource: role dms-vpc-role because no identity-based policy allows the iam:GetRole action

Could someone provide me with some assistance on resolving this issue? Is it perhaps because DMS does not seem to be available as a AWS Service, approved by the Pathfinder team? or I need elevated privileges to allow the DMS service to create a role?

ArogeG commented 2 years ago

@rparfig is investigating why DMS can't see the FAM DB. consider allowing Security groups, inbound and outbound flow from the DB.

basilv commented 2 years ago

DMS would need to query the FAM DB so this means the FAM DB security group inbound flow needs to allow access from DMS, while DMS security group outbound flow needs to allow access to FAM. But usually outbound flows are unlimited, so it is likely just the inbound flow for FAM DB that needs to be configured.

basilv commented 2 years ago

Had a session with Richard and got things working.

Learnings from the AWS configuration - there's a set of initial security groups created as part of the initial environment. The database should be placed in the data_sg security group, and the app in the app_sg security group, which then allows network access via the inbound rule in data_sg allowing traffic from resources in the app_sg security group.

To get DMS to work, it needed to be placed in the app_sg security group.

The existing FAM aurora database has a dedicated security group created by terraform that likely isn't needed.

My access in AWS (as a developer role) did not allow me to even see any of the RDS database instances. I could see all the security groups, but was unable to see the details of the terraform-created security group (e.g. couldn't see the inbound rules, while I could see the rules of the initially created groups like data-sg). Richard with his admin role was able to see all of this.

rparfig commented 2 years ago

Testing was successfully completed using DMS, copying the FAM database to a PostgreSQL DB also hosted in RDS AWS.

Challenge:

Action:

Given that we have proven that DMS can be used to copy the structures and data of an RDS database to a vanilla progress, we can assume that this migration method will work. However, we will try the following:

rparfig commented 2 years ago

Alternative suggested by @ConradBoydElliottGustafson :

Actions:

rparfig commented 2 years ago
rparfig commented 2 years ago

Reached out to the Amazon team for assistance, explaining the issue with not being able to run the shell in the newly created EC2 instance. Awaiting response

rparfig commented 2 years ago

image.png

$ sudo amazon-linux-extras install epel $ sudo tee /etc/yum.repos.d/pgdg.repo<<EOF [pgdg13] name=PostgreSQL 13 for RHEL/CentOS 7 - x86_64 baseurl=http://download.postgresql.org/pub/repos/yum/13/redhat/rhel-7-x86_64 enabled=1 gpgcheck=0 EOF $ sudo yum install postgresql13 postgresql13-server $ sudo /usr/pgsql-13/bin/postgresql-13-setup initdb $ systemctl status postgresql-13 $ sudo su - postgres $ psql postgres=# create database famdb; CREATE DATABASE postgres=# create schema app_fam; CREATE SCHEMA

rparfig commented 2 years ago

Firewall seems to allow all traffic from subnet, but facing this issue:

Test Endpoint failed: Application-Status: 1020912, Application-Message: Cannot connect to ODBC provider Network error has occurred, Application-Detailed-Message: RetCode: SQL_ERROR SqlState: 08001 NativeError: 101 Message: [unixODBC]could not connect to server: Connection refused Is the server running on host "ip-10-12-102-245.ca-central-1.compute.internal" (10.12.102.245) and accepting TCP/IP connections on port 5432?

rparfig commented 2 years ago

Reached out to the OCIO folks as well as to the AWS team. No response from OCIO, waiting for AWS to show availability or book something for us, as Brent is quite booked up.

Started from scratch, after realizing that there is a setting that I was unintentionally ignoring, that allowed me to create an EC2 outside of the Application subnet.

DMS resides in the App group, so it should be able to connect to anything on the App subnet, but 5432 is not open in that subnet, so the first EC2 was expected to fail.

basilv commented 2 years ago

You can always add a new security group with a custom inbound rule to allow the traffic from DMS. But I agree that the second scenario should have worked.

rparfig commented 2 years ago

Tried a different approach. Using the EC2 running on the APP Security Group, I tried to run pg_dump against the RDS Aurora Postgres DB and it worked:

-bash-4.2$ pg_dump -h fam-aurora-db-postgres.cluster-cp9oqzf51oiq.ca-central-1.rds.amazonaws.com -p 5432 -Fc -U sysadmin famdb > app_fam_db.dump
Password:
-bash-4.2$ ls -ltr
total 52
drwx------ 4 postgres postgres    51 Jul 27 22:14 13
-rw-r--r-- 1 postgres postgres 51075 Aug  9 00:06 app_fam_db.dump

-bash-4.2$ psql psql (13.7) Type "help" for help.


postgres=# create user fam_proxy_api;
CREATE ROLE
postgres=# create database app_fam_restored;
CREATE DATABASE
postgres=# \q
-bash-4.2$ pg_restore -d app_fam_restored app_fam_db.dump
pg_restore: while PROCESSING TOC:
pg_restore: from TOC entry 5; 2615 118795 SCHEMA app_fam sysadmin
pg_restore: error: could not execute query: ERROR:  role "sysadmin" does not exist
Command was: ALTER SCHEMA app_fam OWNER TO sysadmin;
-bash-4.2$
-bash-4.2$ psql -d app_fam_restored
psql (13.7)
Type "help" for help.

app_fam_restored=# \d app_fam.fam_application;
                                          Table "app_fam.fam_application"
         Column          |              Type              | Collation | Nullable |             Default
-------------------------+--------------------------------+-----------+----------+----------------------------------
 application_id          | bigint                         |           | not null | generated by default as identity
 application_name        | character varying(100)         |           | not null |
 application_description | character varying(200)         |           | not null |
 create_user             | character varying(30)          |           | not null |
 create_date             | timestamp(6) without time zone |           | not null | CURRENT_DATE
 update_user             | character varying(30)          |           |          |
 update_date             | timestamp(6) without time zone |           |          | CURRENT_DATE
Indexes:
    "fam_app_pk" PRIMARY KEY, btree (application_id)
    "fam_app_name_uk" UNIQUE CONSTRAINT, btree (application_name)
Referenced by:
    TABLE "app_fam.fam_application_group_xref" CONSTRAINT "reffam_application20" FOREIGN KEY (application_id) REFERENCES app_fam.fam_application(application_id)
    TABLE "app_fam.fam_role" CONSTRAINT "reffam_application22" FOREIGN KEY (application_id) REFERENCES app_fam.fam_application(application_id)
    TABLE "app_fam.fam_application_client" CONSTRAINT "reffam_application31" FOREIGN KEY (application_id) REFERENCES app_fam.fam_application(application_id)

app_fam_restored=# select * from app_fam.fam_application;
 application_id | application_name |  application_description  |  create_user  |     create_date     |  update_user  |     update_date
----------------+------------------+---------------------------+---------------+---------------------+---------------+---------------------
              1 | fam              | Forests Access Management | fam_proxy_api | 2022-07-20 00:00:00 | fam_proxy_api | 2022-07-20 00:00:00
              2 | fom              | Forest Operations Map     | fam_proxy_api | 2022-07-20 00:00:00 | fam_proxy_api | 2022-07-20 00:00:00
(2 rows)

app_fam_restored=#
rparfig commented 2 years ago

Meeting with the Pathfinder team scheduled for Monday Aug 15th@3PM, to follow up on configuration issues on our name plate or perhaps in the internal configuration of our AWS Tenancy and the security guardrails set in place.

fadekemiatitebi commented 2 years ago

@rparfig please can you include me in this meeting. In the meantime, are we able to stop the two EC2 instances if we are not using them until next week after discussions with the pathfinder team.

rparfig commented 2 years ago

@fadekemiatitebi : I was careful on choosing the Amazon Free Tier for these EC2 instances, so we should not be charged for these services. However, I have shutdown both EC2s. I have also forwarded you the invite for the meeting with the Pathfinder team on Monday

fadekemiatitebi commented 2 years ago

Thanks @rparfig, It looks like there are some EC2 charges, although not significant. If you still need the instance, you can keep it running. Otherwise stopping it may not be a bad idea since we can always restart it.

rparfig commented 2 years ago

Issue has been solved, with the assistance of the Pathfinder team, during the session held on Aug 15, 2022:

Outstanding / pending for decision:

gormless87 commented 2 years ago

@rparfig - 'prescind' - thanks for the vocabulary expansion