Find out if Postgres foreign-data wrapper will work for cross DB requests

jon-betts commented 2 years ago

We believe that the Postgres foreign-data wrapper (FDW) module will be enough to satisfy our need to make cross DB queries.

In order for this to work we need to validate our assumptions.

Assumptions

[x] We can install the FDW simply by executing an SQL query
[x] We can think of a way to execute this and have it install the correct connection details into the live DBs
[x] These connection details will cascade down to the read-replicas
[x] If a target DB is down, then nothing terrible happens
- [ ] Is it different if there are network problems (i.e security group goes away)?
- [ ] Is it different if the connection is allowed but the DB is just not up?
- [ ] What about bad connection details?
[x] We can make the FDW connection details specific and limited:
- [x] To be read only
- [x] To be over specific tables / columns
- [x] Can we do write only? No
[x] The performance of joins across the tables will be sufficient for us work
- The performance against the raw data does not need to be excellent
- If a roll-up type table is needed to get acceptable performance, that's acceptable
- [x] Is this different if the DBs are co-hosted vs not co-hosted?
- [x] We can handle changes in the foreign schema gracefully enough (for internal reports, who can probably allow a bit of down time)

Security concerns in region

Let's have a think that this doesn't cause any problems in region

Tasks

In https://github.com/hypothesis/lms/issues/4512 we got a read-write instance of h
[x] Get a read-write copy of recent LMS data
[x] Get a read replica of this LMS data
[x] Try connecting the two and answering the above questions

marcospri commented 2 years ago

I reckon for testing this we should add a read replica in the mix to also test the FDW details get replicate down to read replicas.

jon-betts commented 2 years ago

Some initial notes:

Lists 4 steps:

Install the postgres_fdw extension using CREATE EXTENSION.

Create a foreign server object, using CREATE SERVER, to represent each remote database you want to connect to. Specify connection information, except user and password, as options of the server object.

Create a user mapping, using CREATE USER MAPPING, for each database user you want to allow to access each foreign server. Specify the remote user name and password to use as user and password options of the user mapping.

Create a foreign table, using CREATE FOREIGN TABLE or IMPORT FOREIGN SCHEMA, for each remote table you want to access. The columns of the foreign table must match the referenced remote table. You can, however, use table and/or column names different from the remote table's, if you specify the correct remote names as options of the foreign table object.

1. Install `postgres_fdw`

CREATE EXTENSION IF NOT EXISTS postgres_fdw;

This needs to be a super user. Potential issue here?

You can debug which plugins are available with:

SELECT * FROM pg_available_extensions where name='postgres_fdw';

This is available in H and LMS in production.

2. Create a server

When you create a server you don't specify the username or password, that's attached to the user:

(specify these in a user mapping, instead, or use a service file)

Service files probably aren't an option, as I think they are on the DB machine itself.

The SQL to create in the first place looks a bit like this:

CREATE SERVER IF NOT EXISTS local_server_name 
    FOREIGN DATA WRAPPER postgres_fdw
    OPTIONS (
        host 'hostname',
        dbname 'dbname',
        port '5432'
    )

I think we can alter the settings here like this:

ALTER SERVER local_server_name 
    OPTIONS (
    SET host 'new_host',
    SET port 'new_port'
    )

3. Create a user mapping

CREATE USER MAPPING 
    IF NOT EXISTS
    FOR local_reporting_user
    SERVER local_server_name
    OPTIONS (
        user 'remote_reporting_user',
        password 'remote_password'
    )

This will need to exist on both sides.

We can alter the user's settings with something like:

ALTER USER MAPPING 
    FOR local_reporting_user 
    SERVER local_server_name 
    OPTIONS (
        SET user 'new_user',
        SET password 'new_password'
    )

4. Creating foreign tables

You appear to be able to specify every column, or import directly from a remote table. It's seems like the name you choose is the link between the two.

Specifying every column:

CREATE FOREIGN TABLE films (
    code        char(5) NOT NULL,
    title       varchar(40) NOT NULL,
   ...
)
SERVER film_server;

This kind of sucks as we'd have to duplicate the entire schema we are targetting. On the plus side we can pick and choose columns.

Slightly better if we can see all columns is to not specify the exact schema ourselves, but pull it in:

IMPORT FOREIGN SCHEMA film_schema
    LIMIT TO (films)
    FROM film_server
    INTO local_schema

This will automatically copy it over.

What about table schema changes? It seems you have to blow it away and re-import, or manually tweak it, which seems like a pain. If we have calculated tables or views based on this, we are probably going to have to re-calculate them again. This would be a plus point for live, rather than materialized views.

We probably want to look at setting updatable=False on our tables. If our user is read-only we will fail anyway, but this means the error would be raised on our side before we try.

indigobravo commented 2 years ago

Inter-Region

I have been looking into modifications needed to our current topology to enable an inter-region connection. Leaving compliance to one side for the moment. A solution could be found using VPC Peering. A technology that enables IP routing between subnets hosted in different VPCs.

Here is a rough diagram detailing the change from the perspective of a Hypothesis region.

marcospri commented 2 years ago

TLDR summary:

FDW behaves in the ways you'd image it does
It doesn't seem to have a big impact in performance. At least in our current DB setup.
My biggest concern is around using the replica to run BI queries (which tend to be on slow side).

We should consider querying the primary or, accepting that there's a limit on how long queries can take and that limit is different depending on the activity on the primary server. I find that limit hard to reason about and the steps to increase/control it all seem to take a toll on the primary's performance which might be one the reason to pick the replica in the first place.

Some testing python code with comments for findings around postgres FDW.

Setting up connections to H and LMS.

import os
import uuid
import subprocess
import psycopg2

databases = {
    "h": {
        "host": os.environ.get("H_DB_HOST", "localhost"),
        "port": os.environ.get("H_DB_PORT", 5432),
        "name": os.environ.get("H_DB_NAME", "postgres"),
        "user": os.environ.get("H_DB_USER", "postgres"),
        "password": os.environ.get("H_DB_PASS", "postgres"),
    },
    "lms": {
        "host": os.environ.get("LMS_DB_HOST", "localhost"),
        "port": os.environ.get("LMS_DB_PORT", 5433),
        "name": os.environ.get("LMS_DB_NAME", "postgres"),
        "user": os.environ.get("LMS_DB_USER", "postgres"),
        "password": os.environ.get("LMS_DB_PASS", "postgres"),
    },
}

def db_connection(host, port, name, user, password):
    return psycopg2.connect(
        host=host, port=port, database=name, user=user, password=password
    )

h_conn = db_connection(**databases["h"])
lms_conn = db_connection(**databases["lms"])

h_cur = h_conn.cursor()
lms_cur = lms_conn.cursor()

User management

We won't use the existing user in the H database but just one that has access to only selected tables.

Here we are creating this restricted user in H first.

FDW_USER = "fdw_user"
FDW_PASSWORD = "fdw_password"
try:
    h_cur.execute(f"CREATE USER {FDW_USER} WITH PASSWORD %s", (FDW_PASSWORD,))
except psycopg2.errors.DuplicateObject:
    print(f"Role {FDW_USER} already exists.")
    h_conn.rollback()

We are only granting this user access to a couple of tables.

H_TABLES = ["annotation", "document"]
for table_name in H_TABLES:
    h_cur.execute(f"GRANT SELECT ON {table_name} TO {FDW_USER}")
h_conn.commit()

Granting could for example at the schema level (eg all of an hypotheical reporting schema)

In production we'd handle this user as we do any other RDS users, probably outside the application code, ideally still in some form of "code", ie we can make a ansible/terraform PR to add extra permissions.

Initial FDW setup

These statements require RDS's superuser permissions.

They only need to be run once and they don't contain any parameters/setting.

We should run this as part of the RDS initialization process (if there's a script to do that) and maybe just manually for existing DBs (LMS).

Other alternatives could involve giving existing users super user privileges but that's to be avoided.

https://www.postgresql.org/docs/current/sql-createextension.html https://docs.aws.amazon.com/AmazonRDS/latest/UserGuide/Appendix.PostgreSQL.CommonDBATasks.html

Create the extension as admin and grant its usage to the user that the LMS app runs with.

CREATE EXTENSION IF NOT EXISTS postgres_fdw;
GRANT USAGE ON FOREIGN DATA WRAPPER postgres_fdw TO "lms-prod";

Configuration of FDW on LMS

All the following steps could be run on app init like:

https://github.com/hypothesis/h/blob/main/h/db/__init__.py

Point FDW to H (from LMS), here we'd take the connection details from env variables (the location of the H database).

https://www.postgresql.org/docs/current/sql-createserver.html

FDW_NAME = "h"
lms_cur.execute(
    f"CREATE SERVER IF NOT EXISTS {FDW_NAME} FOREIGN DATA WRAPPER postgres_fdw OPTIONS (dbname %s, host %s, port %s)",
    (databases["h"]["name"], databases["h"]["host"], str(databases["h"]["port"])),
)

Now create a user mapping, this makes using a local user in LMS behave like the remote user in H.

Again this values could be read from environment variables.

https://www.postgresql.org/docs/current/sql-createusermapping.html

lms_cur.execute(
    f"""
CREATE USER MAPPING IF NOT EXISTS FOR {databases['lms']['user']}
        SERVER {FDW_NAME}
        OPTIONS (user %s, password %s);
""",
    (FDW_USER, FDW_PASSWORD),
)

Create a schema for the fdw tables, not necessary but tidier

lms_cur.execute("CREATE SCHEMA IF NOT EXISTS h;")

Now import the tables we are interested into the new schema

for table_name in H_TABLES:
    try:
        # Import a selection of tables into the recently created schema
        lms_cur.execute(
            f"""
        IMPORT FOREIGN SCHEMA  "public" LIMIT TO ({table_name}) FROM SERVER {FDW_NAME} INTO h;
        """
        )
        lms_conn.commit()
    except psycopg2.errors.DuplicateTable:
        print(f"H table {table_name} already imported.")
        lms_conn.rollback()

after that we can query H data normally, we just need to specify the schema.

lms_cur.execute("SELECT count(id) from h.annotation")
print("Querying LMS for H's anno count:", lms_cur.fetchone()[0])

Schema changes handling

We'll simulate changes downstream creating a new column directly in H (not using the FDW)

h_cur.execute("ALTER TABLE annotation ADD COLUMN IF NOT EXISTS new_column TEXT")
h_conn.commit()

Trying to just query the new column will fail

try:
    lms_cur.execute("SELECT new_column from h.annotation")
except psycopg2.errors.UndefinedColumn:
    lms_conn.rollback()

We can still make queries not affected by the change

lms_cur.execute("SELECT count(id) from h.annotation")
print("Querying LMS for H's anno count:", lms_cur.fetchone()[0])

The remote table can be altered to match the source one

lms_cur.execute(
    "ALTER FOREIGN TABLE h.annotation ADD COLUMN IF NOT EXISTS new_column TEXT"
)
lms_conn.commit()

And now we can query the new column

lms_cur.execute("SELECT new_column from h.annotation")

Alter the source table again, now remove the new column to go back to normal

h_cur.execute("ALTER TABLE annotation DROP COLUMN new_column")
h_conn.commit()

Dropping the foreign table and re-importing brings the latest schema in.

The same is true at the schema level, cascade deleting the schema where foreign data tables live and then re-importing them will update the structure of all tables.

lms_cur.execute("DROP SCHEMA IF EXISTS h CASCADE")
# Create a schema for the fdw tables, not necessary but tidier
lms_cur.execute("CREATE SCHEMA IF NOT EXISTS h;")

for table_name in H_TABLES:
    try:
        # Import a selection of tables into the recently created schema
        lms_cur.execute(
            f"""
        IMPORT FOREIGN SCHEMA  "public" LIMIT TO ({table_name}) FROM SERVER {FDW_NAME} INTO h;
        """
        )
        lms_conn.commit()
    except psycopg2.errors.DuplicateTable:
        print(f"H table {table_name} already imported.")
        lms_conn.rollback()

This suggests that we should have different approaches available to manage schema changes:

A task we can run on demand that "refresh everything"

This might be slow as some materialized views will need to run again. If in practice is "fast enough" it might be all we need.

It could be also useful to run locally to detect if any changes in the H/LMS schema break any of the new views.

A task to run arbitrary SQL

We could setup alembic (where all migration are SQL based, not model based) or a similar mechanism to surgically alter tables/views, delete and recreate specific ones for when the "recreate the world" approach is not suitable.

Write operations

Write operations are possible through FDW

lms_cur.execute(
    """INSERT INTO h.annotation (id, created, updated, userid, groupid, text, text_rendered, tags, shared, target_uri, target_uri_normalized, target_selectors, "references", extra, deleted, document_id)
    SELECT %s, %s,  %s, userid, groupid, text, text_rendered, tags, shared, target_uri, target_uri_normalized, target_selectors, "references", extra, deleted, document_id from h.annotation  order by created limit 1""",
    (
        uuid.uuid4().hex,
        "2020-10-13",
        "2020-10-13",
    ),
)

We can make the whole server updatable=False and also change the setting at individual tables

lms_cur.execute("ALTER FOREIGN TABLE h.annotation OPTIONS (ADD updatable 'false')")
lms_conn.commit()

try:
    lms_cur.execute(
        """INSERT INTO h.annotation (id, created, updated, userid, groupid, text, text_rendered, tags, shared, target_uri, target_uri_normalized, target_selectors, "references", extra, deleted, document_id)
        SELECT %s, %s,  %s, userid, groupid, text, text_rendered, tags, shared, target_uri, target_uri_normalized, target_selectors, "references", extra, deleted, document_id from h.annotation  order by created limit 1""",
        (
            uuid.uuid4().hex,
            "2020-10-13",
            "2020-10-13",
        ),
    )
except psycopg2.errors.ObjectNotInPrerequisiteState:
    pass
    lms_conn.rollback()

In a production environment we should set the whole server as updatable=False and then change the setting in individual tables.

Failures modes

Create a Materialized View (MT) from a foreign table (FT)

lms_cur.execute(
    """CREATE MATERIALIZED VIEW fdw_mv_anno 
    AS SELECT id, created, updated FROM h.annotation"""
)
lms_cur.execute(
    "REFRESH MATERIALIZED VIEW fdw_mv_anno;"
) 
lms_conn.commit()

MV can be queried

lms_cur.execute("SELECT count(*) from fdw_mv_anno")
lms_conn.commit()

We simulate an error in the source DB, we stop the docker container here

subprocess.run("docker stop h_postgres_1", shell=True, check=True)

MVs have the data in the local DB, no need to access the FDW, it works fine

lms_cur.execute("SELECT count(*) from fdw_mv_anno")

Running any query against foreign tables fails

try:
    lms_cur.execute("SELECT count(*) from h.annotation")
except psycopg2.errors.ConnectionFailure:
    lms_conn.rollback()

Performance

Below some performance testing on the test RDS instances. Also run some join queries using both foreign and local data.

In this testing I can't see a huge impact/overhead of the FDW usage.

I think the usage of a replica is orthogonal to both performance and the use of FDW however running long queries in replicas bring a whole new set of complexity as they can fail with stale data which is very likely on the context of BI.

Replication can be tuned to be this less likely but at the cost of performance and/or memory consumption on the primary.

https://www.postgresql.org/docs/14/hot-standby.html#HOT-STANDBY-CONFLICT

We should consider the option to point FDW directly to the H primary:

Avoids the whole replication category of problems
Allows, if needed, to write from LMS -> H. We don't need this in the current design but it's a possibility.
Full read access to selected tables is the same in both options.

Performance testing results

### Running query against H primary (no FDW) ``` select count(*) from annotation where updated >= '2022-09-01'; count --------- 2224772 (1 row) Time: 50925.574 ms (00:50.926) h=> select count(*) from annotation where updated >= '2022-09-01'; count --------- 2224772 (1 row) Time: 17256.063 ms (00:17.256) h=> select count(*) from annotation where updated >= '2022-09-01'; count --------- 2224772 (1 row) Time: 5152.864 ms (00:05.153) h=> select count(*) from annotation where updated >= '2022-09-01'; count --------- 2224772 (1 row) Time: 5257.149 ms (00:05.257) h=> select count(*) from annotation where updated >= '2022-09-01'; count --------- 2224772 (1 row) Time: 9684.687 ms (00:09.685) h=> select count(*) from annotation where updated >= '2022-09-01'; count --------- 2224772 (1 row) Time: 27806.489 ms (00:27.806) h=> select count(*) from annotation where updated >= '2022-09-01'; count --------- 2224772 (1 row) Time: 3153.568 ms (00:03.154) h=> select count(*) from annotation where updated >= '2022-09-01'; count --------- 2224772 (1 row) ``` ### Running query against the H replica (no FDW) ``` select count(*) from annotation where updated >= '2022-09-01'; count --------- 2224772 (1 row) Time: 190490.385 ms (03:10.490) h=> select count(*) from annotation where updated >= '2022-09-01'; count --------- 2224772 (1 row) Time: 151658.071 ms (02:31.658) h=> select count(*) from annotation where updated >= '2022-09-01'; count --------- 2224772 (1 row) Time: 72298.204 ms (01:12.298) h=> select count(*) from annotation where updated >= '2022-09-01'; count --------- 2224772 (1 row) Time: 2142.585 ms (00:02.143) h=> select count(*) from annotation where updated >= '2022-09-01'; count --------- 2224772 (1 row) Time: 1832.317 ms (00:01.832) h=> select count(*) from annotation where updated >= '2022-09-01'; count --------- 2224772 (1 row) Time: 1672.943 ms (00:01.673) h=> select count(*) from annotation where updated >= '2022-09-01'; count --------- 2224772 (1 row) ``` ### Running queries from LMS replica pointing to the H replica ``` lms-prod=> select count(*) from h.annotation where updated >= '2022-09-01'; count --------- 2224772 (1 row) Time: 3760.586 ms (00:03.761) lms-prod=> select count(*) from h.annotation where updated >= '2022-09-01'; count --------- 2224772 (1 row) Time: 3574.357 ms (00:03.574) lms-prod=> select count(*) from h.annotation where updated >= '2022-09-01'; count --------- 2224772 (1 row) Time: 3740.016 ms (00:03.740) lms-prod=> select count(*) from h.annotation where updated >= '2022-09-01'; count --------- 2224772 (1 row) Time: 3766.915 ms (00:03.767) ``` ### Running queries from LMS replica pointing to the H primary ``` lms-prod=> select count(*) from h_primary.annotation where updated >= '2022-09-01'; count --------- 2224772 (1 row) Time: 21130.083 ms (00:21.130) lms-prod=> select count(*) from h_primary.annotation where updated >= '2022-09-01'; count --------- 2224772 (1 row) Time: 3565.275 ms (00:03.565) lms-prod=> select count(*) from h_primary.annotation where updated >= '2022-09-01'; count --------- 2224772 (1 row) Time: 3015.310 ms (00:03.015) lms-prod=> select count(*) from h_primary.annotation where updated >= '2022-09-01'; count --------- 2224772 (1 row) Time: 3429.981 ms (00:03.430) lms-prod=> select count(*) from h_primary.annotation where updated >= '2022-09-01'; count --------- 2224772 (1 row) ``` ### Running queries from LMS primary pointing to the H replica ``` lms-prod=> select count(*) from h.annotation where updated >= '2022-09-01'; count --------- 2224772 (1 row) Time: 3635.325 ms (00:03.635) lms-prod=> select count(*) from h.annotation where updated >= '2022-09-01'; count --------- 2224772 (1 row) Time: 3619.462 ms (00:03.619) lms-prod=> select count(*) from h.annotation where updated >= '2022-09-01'; count --------- 2224772 (1 row) Time: 3879.840 ms (00:03.880) lms-prod=> select count(*) from h.annotation where updated >= '2022-09-01'; count --------- 2224772 (1 row) Time: 3634.974 ms (00:03.635) ``` ### Running queries from LMS primary pointing to the H primary ``` lms-prod=> select count(*) from h_primary.annotation where updated >= '2022-09-01'; count --------- 2224772 (1 row) Time: 3332.119 ms (00:03.332) lms-prod=> select count(*) from h_primary.annotation where updated >= '2022-09-01'; count --------- 2224772 (1 row) Time: 3091.252 ms (00:03.091) lms-prod=> select count(*) from h_primary.annotation where updated >= '2022-09-01'; count --------- 2224772 (1 row) Time: 3229.864 ms (00:03.230) lms-prod=> select count(*) from h_primary.annotation where updated >= '2022-09-01'; count --------- 2224772 (1 row) Time: 3188.033 ms (00:03.188) lms-prod=> select count(*) from h_primary.annotation where updated >= '2022-09-01'; count --------- 2224772 (1 row) ```

hypothesis / lms