navapbc / template-infra

A template to set up foundational infrastructure for your application in AWS
Apache License 2.0
9 stars 2 forks source link

Use safe names for psql databases and schemas #639

Closed rocketnova closed 3 weeks ago

rocketnova commented 3 weeks ago

Ticket

N/A

Changes

What was added, updated, or removed in this PR.

Context for reviewers

Testing instructions, background context, more in-depth details of the implementation, and anything else you'd like to call out or ask reviewers.

Per the documentation, PostgreSQL identifiers should be letters, underscores, digits, or dollar signs:

SQL identifiers and key words must begin with a letter (a-z, but also letters with diacritical marks and non-Latin letters) or an underscore (_). Subsequent characters in an identifier or key word can be letters, underscores, digits (0-9), or dollar signs ($).

PostgreSQL allows identifiers to include arbitrary characters if the identifier is surrounded by double quotes:

There is a second kind of identifier: the delimited identifier or quoted identifier. It is formed by enclosing an arbitrary sequence of characters in double-quotes (").

The infra template's directory naming convention uses hyphens (-) to separate multi-word names (e.g. app-config, build-repository). The app-config uses the application directory to generate the local.app_name which is used throughout the configuration.

When a project uses the infra template and renames the application directory to a multi-word, hyphenated name, such as flowable-rest, this will cause issues unless every instance of the database and schema names is double quoted. Even though we are properly using the pg8000 identifier in the role manager, while working on the platform-test-rails repo, I still ran into issues with the following lines because my application uses a hyphenated name (i.e. app-rails):

https://github.com/navapbc/template-infra/blob/6f8464ef7ff1cb20a263d0a68705a05e7c3f3206/infra/modules/database/role_manager/manage.py#L104

https://github.com/navapbc/template-infra/blob/6f8464ef7ff1cb20a263d0a68705a05e7c3f3206/infra/modules/database/role_manager/check.py#L32

This PR proposes to protect the database from hyphens by modifying the database name and the schema name to use underscores instead of hyphens.

Breaking change

⚠️ Warning! This could potentially be a breaking change for any existing projects that use hyphenated database names and have worked around the issue!

Testing

Provide evidence that the code works as expected. Explain what was done for testing and the results of the test plan. Include screenshots, GIF demos, shell commands or output to help show the changes working as expected. ProTip: you can drag and drop or paste images into this textbox.

Tested in https://github.com/navapbc/platform-test/pull/104/files:

A screenshot showing that the database name is correctly using an underscore when creating the database resources: CleanShot 2024-06-14 at 14 39 05@2x

Snippet showing successfully using an underscore name when creating the roles:

...
postgres> ALTER DATABASE app_hyphen SET search_path TO app_hyphen
---- Configuring roles
------ Configuring role: username='migrator'
...
---- Configuring schema
------ Creating schema: schema_name='app_hyphen'
postgres> CREATE SCHEMA IF NOT EXISTS app_hyphen
------ Changing schema owner: new_owner=migrator
postgres> ALTER SCHEMA app_hyphen OWNER TO migrator
------ Granting schema usage privileges: grantee=app
postgres> GRANT USAGE ON SCHEMA app_hyphen TO app
---- Configuring superuser extensions
-- Current database configuration
---- Roles
------ Role postgres
------ Role migrator
------ Role app
---- Schema privileges
------ Schema name='public' acl='{pg_database_owner=UC/pg_database_owner,=U/pg_database_owner}'
------ Schema name='app_hyphen' acl='{migrator=UC/migrator,app=U/migrator}'
...

A snippet showing successfully checking roles when using the underscore name:

...
-- Check that search path is %s app_hyphen
migrator> SHOW search_path
-- Check that migrator is able to create tables
-- Clean up role_manager_test table if it exists
migrator> DROP TABLE IF EXISTS role_manager_test
migrator> CREATE TABLE IF NOT EXISTS role_manager_test(created_at TIMESTAMP)
-- Check that app is able to read and write from the table
...
rocketnova commented 3 weeks ago

After discussing with @lorenyu offline, we decided that a better and simpler approach is simply to hardcode the schema name. This reduces unnecessary logic. We will treat the database name, the schema name, the database user/role name, and the database migrator user/role name as special and hardcode them. Closing this PR in favor of a forthcoming PR.