Investigate introducing schema migration tool to meltano db

asmisha commented 3 years ago

Currently, if we add a new dbt model and also add it to hasura, hasura will fail to start until the transformation is executed. The proposed solution is to manage the schemas of the meltano model tables through a 3rd-party tool. The one I found that suits our needs is Prisma (https://www.prisma.io/). It can generate migrations based on previous migrations (not the current db schema), which will allow us to nearly automate the process of keeping the db schema in sync.

vasilii-gainy commented 3 years ago

Discussion notes:

Problem: when the schema changes, Hasura doesn't start until we run the Meltano pipeline. Meltano doesn't have a working mechanism for DB migrations.

Solution:

Option 1: Prisma tool could be useful for this migration
Option 2: Use the existing mechanism (Hasura migrations) and add migrations manually to support all changes in the public schema; restrict Meltano from making any changes in the schema (how?)
Option 3: Each new version of Meltano creates a new version of the table; re-point Hasura to the new version of the table

Considerations:

Hasura should be considered the primary service and shouldn't fail in case of errors in ELT pipelines.

asmisha commented 3 years ago

The problem lies a bit deeper. From what I see, hasura will be down the whole time dbt is working, because of the way dbt works with tables and foreign keys. This means that hasura must be completely disconnected from meltano until the first transformation has run, and again while every subsequent transformation runs.

This means that our full service deployment may look something like this:

1. deploy meltano
2. run full ETL process
3. start hasura
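
A minimal sketch of that ordering, assuming we gate the hasura start on the tables it tracks actually existing after the ETL run. The callables `run_etl`, `list_tables` and `start_hasura` are hypothetical hooks for whatever the deployment really uses, not meltano or hasura APIs:

```python
from typing import Callable, Iterable, List


def missing_tables(required: Iterable[str], existing: Iterable[str]) -> List[str]:
    """Return the required tables that do not exist yet (sorted for stable output)."""
    return sorted(set(required) - set(existing))


def deploy(
    run_etl: Callable[[], None],
    list_tables: Callable[[], Iterable[str]],
    start_hasura: Callable[[], None],
    required: Iterable[str],
) -> None:
    """Run the full ETL first, then start hasura only if every tracked table exists."""
    run_etl()
    missing = missing_tables(required, list_tables())
    if missing:
        # Hypothetical policy: fail the deployment instead of starting a broken hasura.
        raise RuntimeError(f"refusing to start hasura, missing tables: {missing}")
    start_hasura()
```

In a real setup `list_tables` would query the database catalog and `required` would come from the hasura metadata; both are left abstract here.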

This will give us a working environment. Each time we need to redeploy / rerun ETL (meaning, daily) we will:

1. execute full service deployment on a new environment (meltano, lambda, rds)
2. route all graphql traffic to the new environment
3. drop the old environment
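
The rotation above can be sketched roughly as follows. This is only an illustration under the assumption that traffic is switched after the new environment's ETL has succeeded, so a failed pipeline never takes the live hasura down; `Router`, `rotate` and the environment names are made up for the example:

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Router:
    """Tracks which environment currently receives graphql traffic."""
    live: Optional[str] = None
    log: List[str] = field(default_factory=list)

    def rotate(self, new_env: str, etl_ok: bool) -> Optional[str]:
        """Switch traffic to new_env if its ETL succeeded; return the live environment."""
        if not etl_ok:
            # Keep serving from the old environment and discard the failed one.
            self.log.append(f"drop {new_env} (ETL failed)")
            return self.live
        old_env = self.live
        self.live = new_env
        self.log.append(f"route traffic to {new_env}")
        if old_env is not None:
            self.log.append(f"drop {old_env}")
        return self.live
```

For example, rotating to `env-2` with a failed ETL leaves `env-1` live, while a later successful `env-3` takes over the traffic and `env-1` is dropped. The sketch does not address the app schema problem described next.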

The problem with this approach is that we have to preserve the app schema and keep it in sync between environments. However, hasura has its own migrations, which leads to a situation where

old hasura can't work with new meltano schema
new hasura can't work with old hasura schema

BTW, considering this, I can't think of any approach that would let us run multiple hasura instances at all, and thus have a deployment process without downtime.

@vood @vasilii-gainy what do you think?

asmisha commented 3 years ago

For now, let's consider delivering schema changes in two steps:

1. deliver dbt models and run the pipeline
2. deliver hasura + test changes

gainy-app / gainy

Investigate introducing schema migration tool to meltano db #79