marcelcremer commented 6 years ago

Hi everyone,

first of all, thank you very much for your effort and the great lib knex.js. I'm currently using the migration part of knex.js for database versioning and it's awesome. However, I also found some difficulties that I think could be improved.

Feature discussion / request

1. Explain what is your use case

I'm developing a large standard product for our customers. However, we also have customers with special needs who need additional tables and/or default configuration in their seeds. Last but not least, there might be different test cases that require different amount of data (e.g. RESTFul Integration tests only need some data, performance tests need huge amounts of data etc.).

Current solution

We have a knexfile.js with many environments. Each environment has a specific purpose, for example:

Basestate (the default schema of the "standard" solution. This is where others depend of)
One Environment per Customer (contains migrations for additional tables and schemas)
One Environment per "standard" Plugin (contains migrations for that)
One Environment per test case (integration / performance tests)

The same goes for the seeds. Each environment might also have specific seeds. As we need to be able to rollback, a typical environment "reset" would look like:

knex migrate:latest
knex migrate:latest --env=plugin1
knex migrate:latest --env=plugin2
knex migrate:latest --env=myCustomer1
knex seed:run
knex seed:run --env=plugin2
knex seed:run --env=myCustomer1
knex seed:run --env=integrationTestSeed

... I think you get the point here ;)

Main Problems about this

The tight coupling between migrations, seeds and database information cause much config duplication
As someone might need migrations from here and seeds from there, different developers can break things because they don't know what impact a change has
Without writing extra files, it's not possible to "sum up" an environment, because it might take configuration from different parts of the config
We need to inject database information into the knexfile from outside, so that the knexfile itself can be commited to versioning system

I'd love to have environments really be "environments", which are dynamically described by

a database connection (of course)
one or more migration (folders)
one or more seeds (folders)

2. Explain what kind of feature would support this

My proposal would be, to split the knexfile.js in different sections for the settings named above. Also, I'd like to rename the seeds to something more dynamic (e.g. scripts), so that someone can differentiate between cleanup scripts (e.g. for runtime data not covered by a seed), a "real" seed, base configuration data etc..

Here's an example of what I think could fit all needs: Basic Example

module.exports = {
  connections: {
    development: {
      client: "postgres",
      connection: {
        database: "xxx",
        host: "xxx",
        user: "xxx",
        password: "xxx",
        charset: "utf8"
      },
      searchPath: "public, mycoolschema",
      pool: { min: 2, max: 10 }
    }
  },
  migrationstable: "public.migration",
  migrations: {
    basestate: "./db/migrations/basestate",
  },
  scripts: {
    "basestate": "./db/seeds/basestate",
  },
  environments: {
    development: {
      connection: "development",
      migrations: ["basestate"],
      scripts: ["basestate"]
    }
  }
};

Complex Example

module.exports = {
    connections: {
      development: {
        client: "postgres",
        connection: {
          database: "xxx",
          host: "xxx",
          user: "xxx",
          password: "xxx",
          charset: "utf8"
        },
        searchPath: "public, mycoolschema",
        pool: { min: 2, max: 10 }
      },
      customer1: {
        client: "postgres",
        connection: {
          database: "xxx",
          host: "xxx",
          user: "xxx",
          password: "xxx",
          charset: "utf8"
        },
        searchPath: "public, anothercoolschema",
        pool: { min: 5, max: 10 }
      }
    },
    migrationstable: "public.migration",
    migrations: {
      basestate: "./db/migrations/basestate",
      plugin1: "./db/migrations/plugin/1/",
      customer1: "./db/migrations/customer/1/"
    },
    scripts: {
        cleanup: {
            "resetDB": "./db/cleanup/resetFull",
            "resetRuntimeData": "./db/cleanup/resetRuntimeData",
        },
        seeds: {
            "basestate": "./db/seeds/basestate",
            "customer1": "./db/seeds/cust1",
            "customer1": "./db/seeds/basestate",
        },
       "test": {
           "integration": "./db/test/integration",
           "performance": "./db/test/performance"
       }
    },
    defaultEnvironment: "development",
    environments: {
      development: {
        connection: "development",
        migrations: ["basestate", "plugin1"],
        scripts: ["cleanup/resetRuntimeData", "seed/basestate"]
      },
      integration: {
        connection: "development",
        migrations: ["basestate", "plugin1"],
        scripts: [
          "cleanup/resetRuntimeData",
          "seed/basestate",
          "test/integration"
        ]
      },
      customer1Performance: {
        connection: "development",
        migrations: ["basestate", "customer1"],
        scripts: [
          "cleanup/resetRuntimeData",
          "seeds/basestate",
          "seeds/customer1",
          "test/performance"
        ]
      }
    }

Of course one might split the different sections (connections, migrations, scripts, environments) into different files, which is more flexible this way.

3. Give some API proposal, how the feature should work

The API would basically stay the same, the only difference is, that any trigger of migrate / seed will cause all of the configured migrations / seeds to be run, e.g.

knex migrate:latest

runs (for default development)

basestate
plugin1

The same is true for seed:run.

Necessary Changes

the config file structure would change, so that would be a breaking change (update script might be possible if needed)
knex migrate:make needs an additional parameter to address the migration config e.g. knex migrate:make basestate
the defaultEnvironment Parameter would be configurable
To make it possible, that migrations can also be rolled back, an additional column "scope" needs to be introduced in the migration table, which stores the name of the migration configuration which was run (e.g. basestate, customer1,...). The rollback will then be executed in reverse order, per migration "scope".
everything else seems to be basically a reordering of already available functionality, so I'd do my best to give a PR a try...

In my opinion, this would also solve #607.

I'd love to hear your feedback, because I'm not sure if I'm thinking all weird and everything could be achieved easily otherwise ;)

elhigu commented 6 years ago

I might not really understand the main problem you have there. To me it looks like all your problems could be resolved by normal methods of using javascript to prevent need for code duplication (like using variables separate files having configuration fragments which are required in in correct places etc.). And with --knexfile switch of the commanline client.

I don't like much the idea of putting bunch of different configurations in the same knex file (like I actually don't like that current knex file allows having many configurations, which has caused lots of confusion during the years). I would prefer knex client allowing just do declare one configuration (with possibly multiple migration directories to solve #607, but thats a bit separate issue).

I agree that current functionality of the knex client is not really optimal, but if we go this route, it will make knex client and configuration more complex, but it won't still be very useful to support someone else's a bit different use case.

I would like to make basic knex file more simple not more complex and guide people to use javascript to actually achieve more complex functionality for their specific configuration needs. In other words, to provide minimal useful building blocks to help people writing their own specialised behaviour.

For this case I would suggest writing own command line client for knex (using knex's migration APIs) which actually does in optimal way that exact functionality that you need. Or write a bit more javascript in knexfile.js.

(I didn't close this yet because the feature request was well written and I would like to see more feedback for this first)

marcelcremer commented 6 years ago

@elhigu Thank you very much for your feedback.

What I did for the moment was to implement another layer above the knex configuration, which "builds" different knex configurations and executes them programatically. I still have seperate tables to track the migration state for each directory which is not a nice, but a pragmatic solution.

I think, this also was my main concern, as a full "solution" has tables like

base_migrations plugin1_migrations customerx_migrations

to track their migration state independently, which feels not very elegant.

Also, what just came to my mind is the rollback function, as it's not aware, that those migrations need to be rolled back in reverse order and they don't know about each other.

However, I see your point that it would make the basic configuration more complicated - especially for simple use cases.

Maybe I need to sleep a night or too about it to come up with a better (or easier) solution.

knex / knex

Proposal: Migration overhaul #2613