fauna-labs / fauna-schema-migrate

The Fauna Schema Migrate tool helps you set up Fauna resources as code and perform schema migrations.
MIT No Attribution
88 stars 11 forks source link

This repository contains unofficial patterns, sample code, or tools to help developers build more effectively with Fauna. All Fauna Labs repositories are provided “as-is” and without support. By using this repository or its contents, you agree that this repository may never be officially supported and moved to the Fauna organization.


Fauna Schema Migrate

Deprecation Notice

This project is deprecated and no longer in development

Fauna has introduced a suite of officially supported schema language and tooling that fully replace Fauna Schema Migrate. These are the Fauna Schema Language (FSL), Fauna Command Line Interface (Fauna CLI), and GitHub and GitLab integrations. Together, these features allow you to apply DevOps principles to database development, and incorporate schema-level changes into broader CI/CD workflows.

The Fauna Schema Language is a declarative language for expressing your entire database schema as a set of .fsl files. FSL files are intended to be managed in the same way that as users manage their application code. That is, FSL files can be managed in common revision control systems, where they can be shared, peer-reviewed, and version tracked, and where changes are automated into users' continuous integration and continuous deployment pipelines.

The Fauna Command Line Interface has been expanded to support database schema management. The Fauna CLI is a standalone tool through which users send FQL queries, upload CSV files, and now most recently, upload or download FSL files to and from your database. It is designed to be integrated into your development workflows, and is therefore, now the recommended and officially supported way to manage schema.

Introduction

Goals

The Fauna Schema Migrate tool is an opinionated tool with three goals:

What is a 'schema' migration?

Schema is a confusing name but typically what we search for for this sort of thing. The Fauna resources such as Functions, Collections, Indexes, Roles and AccessProviders are considered schema. Although data migrations can be done as well, this is a very different problem, see the section on data migration for more information.

Setup

Install the tool in a repository that has the faunadb javascript driver installed. The javascript driver is a peer dependency and will therefore not be pulled in by the tool. It makes more sense to rely on the same fauna driver as the repository to allow users to be in control of that version.

npm install @fauna-labs/fauna-schema-migrate

Run interactively with npx:

npx fauna-schema-migrate run

Or with:

node_modules/.bin/fauna-schema-migrate

The minimum driver version should be:

"faunadb": "^4.0.3"

This driver contains some minor fixes to the FQL formatting which are required for the tool to work properly.

Available commands

Run npx fauna-schema-migrate to see all commands and extra options.

Commands:
  run                             Run interactively
  init                            Initializing folders and config
  state                           Get the current state of cloud and local migrations
  generate                        Generate migration from your resources
  rollback [amount] [childDb...]  Rollback applied migrations in the database
  apply [amount] [childDb...]     Apply unapplied migrations against the database

Or run npx fauna-schema-migrate run to test it out interactively. All commands can be run interactively as well by using run. The command parameters from rollback and apply are not (yet) accessible interactively though.

Flow

  1. Initialize standard folders with npx fauna-schema-migrate init or by using the interactive run command.

    Init will generate a fauna folder and a .fauna-migrate.js config file. If you don't want to change anything to the default folders or default collection to store migration state, you can remove the .fauna-migrate.js file.

    The default folder structure which will be generated by init looks as follows:

    fauna > resources
         > migrations
  2. Write resources

    Create your resources by adding .fql or .js files with a default export to the fauna > resources folder. Any file, even in sub-folders will be picked up by the tool so you can organize your files as you wish. For example you could organize files as follows:

    fauna > resources > users_collection.fql
                  > users_by_name_index.fql
                  > logged_in_users_role.fql

    Or use folders and order by resource type or domain:

    fauna > resources > collections > users.fql
                  > indexes     > users_by_name.fql
                  > roles       > logged_in_users.fql

    It doesn't matter, there is only one caveat and that is to avoid the foldername 'dbs' unless otherwised configured since that name is reserved to support child database migrations. Resource files always have to contain one (and only one) of the following functions.

    CreateCollection(...)
    CreateIndex(...)
    CreateFunction(...)
    CreateRole(...)
    CreateAccessProvider(...)

    For example, the following is a valid .fql file:

    CreateIndex({
     name: 'users_by_alias',
     source: Collection('users'),
     terms: [
       {
         field: ['data', 'alias'],
       },
     ],
     unique: true,
     serialized: true,
    })

    When using .js files, you can import/write anything you want as long as the default export is one of the above. Below is an example from Fwitter that creates a function based on an external file:

    import faunadb from 'faunadb'
    
    import { CreateFweet } from '../fauna/src/fweets'
    const q = faunadb.query
    const { CreateFunction, Query, Var, Lambda, Role } = q
    
    export default CreateFunction({
     name: 'create_fweet',
     body: Query(Lambda(['message', 'hashtags', 'asset'], CreateFweet(Var('message'), Var('hashtags'), Var('asset')))),
     role: Role('functionrole_manipulate_fweets'),
    })
  3. Generate migrations with npx fauna-schema-migrate generate or via the interactive tool started by run

    Generate will look at the resources folders and the migrations folder and determine what has changed in the resources folder since the last time we ran generate. Generate will generate the FQL for the migration in .fql files. The tool will verify whether all resources that are referenced are present in the resources. For example, you can't currently create a role that secures a collection which is not present in your resources folder.

    Why this intermediate step? This is useful since migrations can also contain JavaScript which might be imported from other files to maximize reuse and FQL composition. We would risk to change old migrations if we change external JavaScript files. Generate essentially sets the migration in stone by taking the FQL that the JS file generated and turning it into a pure standalone .fql file. Changing externally imported files will no longer impact the migration (but when running generate, will trigger a new migration if the resulting query has changed)

  4. Verify the migration

    The tool is still in beta and relies heavily on comparison of FQL json formats. It has been tested in several advanced scenarios. When you start using it early on. Verify whether the migration results are correct in the migrations folder. The migration folder will contain different migrations folders with a timestamp as name. Each generated migration contains an .fql file for each resource that was created/updated/deleted. Below is an example structure:

    fauna > migrations > 2021-01-25T20:49:16.074Z > create-collection-accounts.fql
                                              > create-collection-comments.fql
                                              > create-index-accounts-by-name.fql
                   > 2021-02-09T15:49:58.115Z > create-collection-fweets.fql
                                              > create-index-all_fweets.fql

    Do not edit migrations by hand except if you are never going to use generate. If there is a desire for writing migrations by hand, let us know, we can add another modus for that later on. Migrations can contain the following top-level statements.

    CreateCollection(...)
    CreateIndex(...)
    CreateFunction(...)
    CreateRole(...)
    CreateAccessProvider(...)
    Update(...)
    Delete(...)
  5. Apply or rollback as desired.

    Once there are migrations, use apply and rollback to apply or roll back your migrations one by one on the database.

    npx fauna-schema-migrate apply

    npx fauna-schema-migrate rollback

    In contrast to existing tools to aid managing fauna resources, these functions will combine all to be executed resource migrations in one transaction by using a Let statement. Either the full transaction will fail or complete, nothing will be applied partially. The migration query will be printed for educational purposes, if the print becomes big you can opt to remove it by passing -n or --no-print .

    Both rollback as apply can be run from the interactive tool as well and both take two options (which are not available in the interactive tool). The first option is the amount of migrations to apply/rollback, the second option is to support child databases as explained further. Applying multiple migrations at once will combine these multiple migration steps in one transactions intelligently. This is an optimization intended to speed up setting a database from scratch in development scenarios.

  6. Optionally verify the state.

    npx fauna-schema-migrate state

Extras

ENV variables:
Faster local development with a child database

One of the parameters, FAUNA_CHILD_DB is useful in development in case you often find yourself completely nuking the database and reapply everything from scratch. In that case, you might bump into the cache which requires you to wait for 60 seconds before recreating the resources. FAUNA_CHILD_DB will create your resources in a child database instead of in the database that the admin key points at. When using FAUNA_CHILD_DB and rolling back before the first transactions (e.g. with npx fauna-schema-migrate rollback all) the rollback will nuke the database instead of applying the rollback which essentially avoids the 60 seconds cache.

Managing multiple databases

Support for child databases is provided by adding the folder dbs in the resources directory. This allows to define resources in child databases (recursively). A potential structure could be:

fauna > resources > ... resources for the root db ...
                  > dbs > nameOfchildDb1 > ... resources ...
                        > nameOfChildDb2 > ...
                        > dbs > nameOfChild1OfChildDb2 > ...

The 'dbs' name can be configured in the config. When running npx fauna-schema-migrate generate the tool will generate migrations to create child databases as well as generate migrations for all resources within the child databases. The above would result in a migration on the root database to create the two child databases and a dbs folder within the migrations folder that will contain migrations for all child database etc. Migrations of child databases are applied completely separate from the root database. We can currently not update multiple databases transactionally and some generate steps might not have a new migration for a specific database. Therefore, it might not make sense to have an apply command over multiple databases.

When running npx fauna-schema-migrate apply or npx fauna-schema-migrate rollback only the database migrations of one database will be applied depending on the parameters, by default that will be the root database which the admin key points at. Therefore, running npx fauna-schema-migrate apply without parameters will run the migrations of the root database.

Running npx fauna-schema-migrate apply 1 nameOfChildDb1 will apply a migration on the first child database. Running npx fauna-schema-migrate apply 1 nameOfChildDb2 namdOfChild1OfChildDb2 would apply a migration on the child of the second child databases. This functionality is not (yet) available in the interactive version.

Where to go from now?

Potential extensions

Please let us know your requirements so we can decide what would be a good next step.

Philosophy

These are some extensive notes on why the tool is designed as it is.

Keep FQL composable

The FaunaDB Query Language (FQL) is very different than other query languages like SQL, Cypher or ORM approaches. One of the features that are worth embracing in FQL is that writing FQL queries is essentially function composition. For example:

Collection("accounts")

is just a reference to a collection. Throw that in another function to get a reference to a document within that collection.

Ref(Collection('accounts'), '268431417906561542')

That beautiful part is that these are just functions that define your query and we can use the host language (in this case JavaScript) to start saving snippets of queries for reuse. A very silly example would be:

const accountCollection = Collection("accounts")
const account = Ref(accountCollection,"268431417906561542")
...

But you can imagine when queries implement complex transactional logic or start joining extensively that this becomes very useful. It's something we take advantage of in many examples such as Fwitter or the authentication skeletons to define return formats only once or write reusable snippets that can be easily mixed into other queries. For example, a query that creates a new user but also implements rate limitation and adds validation could be abstracted away in JavaScript as follows:

function CreateUser(email, password) {
    let FQLStatement = <query to create a user>
    FQLStatement = ValidateEmail(FQLStatement, email, ['data', 'email'])
    FQLStatement = ValidatePassword(FQLStatement, password, ['data', 'password'])
    return AddRateLimiting('register', FQLStatement, email)
}

What ValidateEmail, ValidatePassword or AddRateLimiting do is not important, but it's important that this composability might be used and we want to make sure it doesn't conflict with our migrations. If we directly use the widely used approach of many migration frameworks of writing up/down statements as popularized by Rails (as far as I know) . For example:

module.exports.up = () => {
  return CreateCollection({ name: 'Users' })
}

module.exports.down = () => {
  return Delete(Collection('Users'))
}

This might be a bad idea in case the up definition would have looked like:

import { CreateOrUpdateFunction } from './../helpers/fql'

module.exports.up = () => {
  CreateFunction(
    'register',
    Query(Lambda(['email', 'password'],
      RegisterAccountWithEmailVerification(Var('email'), Var('password'))))
  )
}
...

Your migration is no longer fixed as this code that was important can evolve separately from the migration and that's probably hard to manage. This is something that's quite unlikely for the creation of indexes and collections but becomes a real problem for User Defined Functions that want to share some logic.

Do less work and reduce the possibility of errors.

The idea of this schema framework library is not only to help support stable schema migrations. First and foremost it's intended to help express FaunaDB resources as code. In essence support the resources equivalent of Infrastructure as Code (IaC) but you can't really call this 'infrastructure' since FaunaDB handles all infrastructure for you.

If we define a function, index or collection as follows

module.exports.up = () => {
  CreateFunction(name, body)
}

The down function can easily be derived so it would be a bit silly to ask the user to write this. Especially since down migrations are often not tested thoroughly so it's easy to get them wrong.

Instead we chose to let the user define the Create statement and will derive the down statement automatically when rolling back or even derive Update statements when we noticed that resources has changed since the last migration.

Detect dependencies.

Functions can depend on roles or can depend on collections and typically you would define these in one migration. This requires you sometimes to:

Instead of requiring the user to think of this order we'll detect it and order the different resources in one migration automatically.

A nice overview of resources

When writing fauna examples myself, I like my resources to be nicely presented making easy to find resources by type or by domain. Up/down migrations don't deliver that and scatters your resources in an unorganized series of files. Having a resource folder with simple names without timestamps which I can organize how I want, in either FQL or JS files brings more clarity to a repository.

A format to exchange Fauna functionality?

If the approach is embraced by our community, we could have an easy uniform format to share schema files in repositories. I can now package a few fauna repositories in a GitHub repository that tackle specific functionality. For example,

Notes on data migration

This library is not intended for data migration. Thats essentially since pagination kicks in when one tries to write huge data migrations which means that one huge data migration can't be performed in one transaction. Such huge data migrations rarely have a rollback script specified. Instead developers typically take a backup before they do it, test it extensively on real-life data or ideally.. avoid the need for big data migrations completely. If necessary, it's often a sign that the migration strategy of an application should be revised since a huge migration transaction that runs for several hours is also a big risk.

Alternative strategies:

Instead you could start the migration at the moment you deploy the new code, keep the timestamp when the migration started and after completing the batches, verify which new data that arrives that was still inserted with the old code and should be migrated and go through the pages again. Until we arrive at a point that the 'new data' is an empty page in which case we successfully migrated all data.

This clearly requires your code to be able to deal with both the old format as the new format of the data in given certain timespan which is something that's advisable regardless of the database. There are moments where this is simply not possible due to a difficult implementation process that involves many services. In that case there are two things you can do:

Need it anyway?

Maybe there is a need to have small seed data in your database or there might be value in looking into a separate npm library to help with such complex data migrations. Let us know what you need, your feedback is important to us.

Inspiration

This library was inspired by excellent database libraries and efforts of community users such as