AmpersandTarski / Ampersand

Build database applications faster than anyone else, and keep your data pollution free as a bonus.
http://ampersandtarski.github.io/
GNU General Public License v3.0

Generate a data migration script #1393

Open stefjoosten opened 1 year ago

stefjoosten commented 1 year ago

As a migration engineer, I want to change an existing system (in production) in increments. For each increment, I want to migrate the population to the new system without loss of semantics. For this purpose, I want to compile an Ampersand script (the new system) together with the script of the existing system to generate a migration script.

Task

In the Ampersand compiler, write the code for generating the migration script.

Problem

The problem we are trying to solve is that changing an Ampersand system in production requires a migration engineer to move the population "by hand". This occurs, for example, in the Semantic Treehouse project. In RAP we have always avoided this problem by resetting the database and discarding its contents, so no migration was ever needed.

Analysis

A migration script is an Ampersand script that specifies the migration system. The migration system includes the existing system and the new system, each in a separate namespace. The migration script includes ENFORCE rules to transfer the population of the existing system to the new system. This script can be altered by the migration engineer to incorporate specific migration requirements. When deployed, the migration script works on the existing data set and on the new data set simultaneously.

A formal analysis is underway. We have decided to finish that analysis before starting any work on this issue. This issue depends on completion of issue #1307.

Proposal

The Ampersand compiler can compare two scripts (ideally taken from different commits in a Git repo) and generate a migration script from them. The migration engineer can fine-tune this script and run it to perform the migration. The purpose is to accelerate migrations in which the database schema changes, so that the maintenance team can deploy increments more frequently.
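To make the comparison step concrete, here is a minimal sketch in Python of how two scripts might be compared at the level of relation declarations. It is not the Ampersand compiler's implementation (which is still to be written and depends on the formal analysis). The `Relation` type, the `compare_schemas` function, and the idea of classifying relations as kept, removed, or added are assumptions made purely for illustration.

```python
from dataclasses import dataclass


# Hypothetical, simplified stand-in for a relation declaration in a script:
# a name plus source and target concepts. Not the compiler's data type.
@dataclass(frozen=True, order=True)
class Relation:
    name: str
    source: str
    target: str


def compare_schemas(old, new):
    """Classify relations by whether they occur in the old script,
    the new script, or both (assumed comparison criterion: equal
    name, source and target)."""
    old_set, new_set = set(old), set(new)
    return {
        # Present in both scripts: population could be carried over as-is.
        "kept": sorted(old_set & new_set),
        # Only in the existing system: the engineer must decide what happens
        # to this population.
        "removed": sorted(old_set - new_set),
        # Only in the new system: starts empty unless the engineer
        # specifies how to populate it.
        "added": sorted(new_set - old_set),
    }


if __name__ == "__main__":
    existing = [Relation("owner", "Car", "Person"),
                Relation("colour", "Car", "Colour")]
    updated = [Relation("owner", "Car", "Person"),
               Relation("price", "Car", "Amount")]
    for kind, relations in compare_schemas(existing, updated).items():
        print(kind, [r.name for r in relations])
```

In such a sketch, the "kept" relations are the candidates for automatically generated transfer rules, while "removed" and "added" relations mark the places where the migration engineer would most likely need to fine-tune the generated script.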

This issue is based on the following assumptions and requirements: