knex / knex

A query builder for PostgreSQL, MySQL, CockroachDB, SQL Server, SQLite3 and Oracle, designed to be flexible, portable, and fun to use.
https://knexjs.org/
MIT License
19.23k stars 2.12k forks source link

Proposal: Migration overhaul #2613

Closed marcelcremer closed 6 years ago

marcelcremer commented 6 years ago

Hi everyone,

first of all, thank you very much for your effort and the great lib knex.js. I'm currently using the migration part of knex.js for database versioning and it's awesome. However, I also found some difficulties that I think could be improved.

Feature discussion / request

1. Explain what is your use case

I'm developing a large standard product for our customers. However, we also have customers with special needs who need additional tables and/or default configuration in their seeds. Last but not least, there might be different test cases that require different amount of data (e.g. RESTFul Integration tests only need some data, performance tests need huge amounts of data etc.).

Current solution

We have a knexfile.js with many environments. Each environment has a specific purpose, for example:

The same goes for the seeds. Each environment might also have specific seeds. As we need to be able to rollback, a typical environment "reset" would look like:

knex migrate:latest
knex migrate:latest --env=plugin1
knex migrate:latest --env=plugin2
knex migrate:latest --env=myCustomer1
knex seed:run
knex seed:run --env=plugin2
knex seed:run --env=myCustomer1
knex seed:run --env=integrationTestSeed

... I think you get the point here ;)

Main Problems about this

I'd love to have environments really be "environments", which are dynamically described by

2. Explain what kind of feature would support this

My proposal would be, to split the knexfile.js in different sections for the settings named above. Also, I'd like to rename the seeds to something more dynamic (e.g. scripts), so that someone can differentiate between cleanup scripts (e.g. for runtime data not covered by a seed), a "real" seed, base configuration data etc..

Here's an example of what I think could fit all needs: Basic Example

module.exports = {
  connections: {
    development: {
      client: "postgres",
      connection: {
        database: "xxx",
        host: "xxx",
        user: "xxx",
        password: "xxx",
        charset: "utf8"
      },
      searchPath: "public, mycoolschema",
      pool: { min: 2, max: 10 }
    }
  },
  migrationstable: "public.migration",
  migrations: {
    basestate: "./db/migrations/basestate",
  },
  scripts: {
    "basestate": "./db/seeds/basestate",
  },
  environments: {
    development: {
      connection: "development",
      migrations: ["basestate"],
      scripts: ["basestate"]
    }
  }
};

Complex Example

module.exports = {
    connections: {
      development: {
        client: "postgres",
        connection: {
          database: "xxx",
          host: "xxx",
          user: "xxx",
          password: "xxx",
          charset: "utf8"
        },
        searchPath: "public, mycoolschema",
        pool: { min: 2, max: 10 }
      },
      customer1: {
        client: "postgres",
        connection: {
          database: "xxx",
          host: "xxx",
          user: "xxx",
          password: "xxx",
          charset: "utf8"
        },
        searchPath: "public, anothercoolschema",
        pool: { min: 5, max: 10 }
      }
    },
    migrationstable: "public.migration",
    migrations: {
      basestate: "./db/migrations/basestate",
      plugin1: "./db/migrations/plugin/1/",
      customer1: "./db/migrations/customer/1/"
    },
    scripts: {
        cleanup: {
            "resetDB": "./db/cleanup/resetFull",
            "resetRuntimeData": "./db/cleanup/resetRuntimeData",
        },
        seeds: {
            "basestate": "./db/seeds/basestate",
            "customer1": "./db/seeds/cust1",
            "customer1": "./db/seeds/basestate",
        },
       "test": {
           "integration": "./db/test/integration",
           "performance": "./db/test/performance"
       }
    },
    defaultEnvironment: "development",
    environments: {
      development: {
        connection: "development",
        migrations: ["basestate", "plugin1"],
        scripts: ["cleanup/resetRuntimeData", "seed/basestate"]
      },
      integration: {
        connection: "development",
        migrations: ["basestate", "plugin1"],
        scripts: [
          "cleanup/resetRuntimeData",
          "seed/basestate",
          "test/integration"
        ]
      },
      customer1Performance: {
        connection: "development",
        migrations: ["basestate", "customer1"],
        scripts: [
          "cleanup/resetRuntimeData",
          "seeds/basestate",
          "seeds/customer1",
          "test/performance"
        ]
      }
    }

Of course one might split the different sections (connections, migrations, scripts, environments) into different files, which is more flexible this way.

3. Give some API proposal, how the feature should work

The API would basically stay the same, the only difference is, that any trigger of migrate / seed will cause all of the configured migrations / seeds to be run, e.g.

knex migrate:latest

runs (for default development)

The same is true for seed:run.

Necessary Changes

In my opinion, this would also solve #607.

I'd love to hear your feedback, because I'm not sure if I'm thinking all weird and everything could be achieved easily otherwise ;)

elhigu commented 6 years ago

I might not really understand the main problem you have there. To me it looks like all your problems could be resolved by normal methods of using javascript to prevent need for code duplication (like using variables separate files having configuration fragments which are required in in correct places etc.). And with --knexfile switch of the commanline client.

I don't like much the idea of putting bunch of different configurations in the same knex file (like I actually don't like that current knex file allows having many configurations, which has caused lots of confusion during the years). I would prefer knex client allowing just do declare one configuration (with possibly multiple migration directories to solve #607, but thats a bit separate issue).

I agree that current functionality of the knex client is not really optimal, but if we go this route, it will make knex client and configuration more complex, but it won't still be very useful to support someone else's a bit different use case.

I would like to make basic knex file more simple not more complex and guide people to use javascript to actually achieve more complex functionality for their specific configuration needs. In other words, to provide minimal useful building blocks to help people writing their own specialised behaviour.

For this case I would suggest writing own command line client for knex (using knex's migration APIs) which actually does in optimal way that exact functionality that you need. Or write a bit more javascript in knexfile.js.

(I didn't close this yet because the feature request was well written and I would like to see more feedback for this first)

marcelcremer commented 6 years ago

@elhigu Thank you very much for your feedback.

What I did for the moment was to implement another layer above the knex configuration, which "builds" different knex configurations and executes them programatically. I still have seperate tables to track the migration state for each directory which is not a nice, but a pragmatic solution.

I think, this also was my main concern, as a full "solution" has tables like

base_migrations plugin1_migrations customerx_migrations

to track their migration state independently, which feels not very elegant.

Also, what just came to my mind is the rollback function, as it's not aware, that those migrations need to be rolled back in reverse order and they don't know about each other.

However, I see your point that it would make the basic configuration more complicated - especially for simple use cases.

Maybe I need to sleep a night or too about it to come up with a better (or easier) solution.