moigagoo / norm

A Nim ORM for SQLite and Postgres
https://norm.nim.town
MIT License
380 stars 34 forks source link

Implement migrations for adding missing columns and other schema modifications #21

Closed zedeus closed 4 years ago

zedeus commented 5 years ago

When adding new fields to objects, reusing the same database throws the "column missing" error. For sqlite new columns can be added with the ALTER TABLE command. It would be great if this could either be done automatically, or explicitly via a function call, not sure what the best approach would be. This would prevent the need for migration scripts which can grow out of hand for cases where lots of fields get added over time. Since norm aims to make the usage of databases from Nim more convenient, I think this would be a welcome addition.

moigagoo commented 5 years ago

Migrations is a big and complex topic. I'm not sure how to do them yet in a way that would be simple to implement yet useful for the user. Could you please provide an example of a situation that requires a migration?

moigagoo commented 4 years ago

@zedeus I'm currently working on providing the means for writing migrations in this PR: https://github.com/moigagoo/norm/pull/30

The plan is to add the necessary building blocks here and create a CLI tool to generate and apply migrations.

zedeus commented 4 years ago

Very cool, thank you for working on this. Have you looked into using SQLite's ALTER TABLE? Your method seems like it will generally work, but I have some concerns regarding performance when dealing with large tables.

moigagoo commented 4 years ago

@zedeus Yeah, sure, I've looked into ALTER TABLE approach. It seemed harder to implement and wouldn't give me as much flexibility (SQLite doesn't support remove table, for example). I'm making decisions as I go, and I don't want to to implement too much when I'm not even sure I'm doing the right thing. Hope this makes sense.

I have a vision of how to deal with migrations , and it requires writing an additional CLI tool, but I have to implement the necessary basis here in Norm first. So, my goal is to deliver that necessary minimum fast and check if my vision is at all viable.

Of course, switching to ALTER TABLE would be more optimal on large tables. But I thought it was better to make it work, then make it faster. I hope for your understanding, and I hope it won't scare people away. I haven't even fixed the 1+n problem in Norm, so it is far form being optimal at the moment :-)

zedeus commented 4 years ago

No problem, your approach seems good for now anyway. Was just wondering :)

moigagoo commented 4 years ago

@zedeus The PR is ready to me merged. To add a missing column, use addColumn:

  1. You have a model:
    dbTypes:
    type
    Pet = object
      name: string
      age: int
  2. You add new field:
    dbTypes:
    type
    Pet = object
      name: string
      age: int
      breed {.default: "unknown".}: string
  3. You write a migration:
    
    import pet

withDb: addColumn Pet.breed


4. You run the migration to update the DB.

Similarly, you can rename and drop columns and rename tables.

Could you please review the interface I'm going to release and maybe try installing Norm from feature/first_step_toward_migrations branch and trying it out before I merge the PR?
zedeus commented 4 years ago

Hi, thank you again very much for the work put into this. The interface is looking great, elegant and easy to work with.

The requirement of a non-empty default value is a bit annoying. For example, I discovered my Video object needs to have a description field, but it may be empty (I know I can use Option here but this is just in general). I would expect to be able to just add the field and doing addColumn(Video.description), but then I get this: Error: unhandled exception: Cannot add a NOT NULL column with default value NULL [DbError]

So naturally I tried {.default: "".} as a workaround, but again I get an error: Error: unhandled exception: incomplete input [DbError]

I would rather not have to use some default non-empty string. If the default pragma is left out, the column should just be initialized to the type's default value like is already the case when creating a new table.

Something I would like to have is the ability to check if a column exists, but I'm not sure about the way to handle this best. The use case I'm having in mind here is I would prefer to not have a separate migration script for every new column I want to add, but instead just one or very few scripts. If you try to add a column that exists, you get duplicate column name: <field> [DbError]. Being able to check if it already exists would let me create a function/template that would act as a safeAddColumn. It would also allow to check if an old field name exists, which can then be renamed or removed appropriately, allowing for a dynamic migration script.

Let me know what you think about the above, I think it would provide for a much cleaner solution that having a folder with migration scripts that grows over time.

moigagoo commented 4 years ago

So naturally I tried {.default: "".} as a workaround, but again I get an error: Error: unhandled exception: incomplete input [DbError]

Yeah, this bugs me too. I think default: "''" could help, but I'm not sure. Or something like """""""" T_T.

If the default pragma is left out, the column should just be initialized to the type's default value like is already the case when creating a new table.

So, you mean we should add DEFAULT to table schemas by default when default pragma is not provided? I'm double checking because contrary to what you said, this is not the case when creating a new table as of now.

Sounds like a reasonable idea to avoid making the user add new columns with mandatory default or Option type. I'll explore into that but in a separate PR (this one is already way too large).

Alternatively, we could just make default pragma easier to use, i.e. allow empty stings and numbers 🤔

Something I would like to have is the ability to check if a column exists, but I'm not sure about the way to handle this best.

Hmm, I can add an IF NOT EXISTS to the ADD COLUMN queries, seems like the easiest way to solve this particular problem.

The use case I'm having in mind here is I would prefer to not have a separate migration script for every new column I want to add, but instead just one or very few scripts.

Well, this is exactly the way of writing migrations I'd like to have. My inspiration is RoR where you have a lot of trivial migration scripts each of which does one simple thing. Such migrations are easy to generate, apply, and undo. Since my plan is to write a migration assistance for automatic migration generation and application, I think I'll stick to this model.

However, for manually-written migrations, you can use any paradigm you find best for your app. I can't see any problem with adding explicit columnExists(string) proc, seems easy to do. If you think this is something I should add, please open an issue for it (again, this PR has already gone out of hand).

zedeus commented 4 years ago

So, you mean we should add DEFAULT to table schemas by default when default pragma is not provided?

I'm not sure. Currently when creating a table, there are no rows for which to set the defaults, so I can see how this is a new problem. With that said, consider the following:

type
  VideoType* = enum
    vmap, m3u8, mp4

dbTypes:
  type
    Video* = object
      videoId*: string
      durationMs*: int
      available*: bool
      playbackType* {.
          dbType: "STRING"
          parseIt: parseEnum[VideoType](it.s)
          formatIt: dbValue($it)
        .}: VideoType

dbFromTypes("cache.db", "", "", "", [Video])

withDb:
  Video.createTable(force=true)
  var v = Video()
  v.insert()

Create statement:

CREATE TABLE video (
    id INTEGER NOT NULL PRIMARY KEY,
    videoId TEXT NOT NULL,
    durationMs INTEGER NOT NULL,
    available INTEGER NOT NULL,
    playbackType STRING
)

Inserted object:

(id: 1, videoId: "", durationMs: 0, available: false, playbackType: vmap)

In this case where the fields aren't set explicitly and there are no default pragmas, they're set to the types' default values. This can retreived by calling default on the type, so my suggestion is to simply use that when there is no default pragma.

Hmm, I can add an IF NOT EXISTS to the ADD COLUMN queries, seems like the easiest way to solve this particular problem.

This sounds fine.

I can't see any problem with adding explicit columnExists(string) proc, seems easy to do. If you think this is something I should add, please open an issue for it (again, this PR has already gone out of hand).

Great, will do.

moigagoo commented 4 years ago

In this case where the fields aren't set explicitly and there are no default pragmas, they're set to the types' default values.

Let me clarify this please. Nim objects are initialized with the default values of the types of their fields. This is Nim behavior.

So, when you init an empty Nim object and insert it into the DB, you don't have to have DEFAULT for columns. The DB never receives any NULLs only because Nim cares about populating the object fields.

However, if you modify your DB schema by adding a column, this is pure DB manipulation, it has nothing to do with Nim types or Nim's object initialization policy. This is why you either have to set DEFAULT for new columns with default pragma or set it somehow implicitly based on the field's Nim type.

zedeus commented 4 years ago

Right, I understand that. My point was that implicitly using the type's default value for new columns would be equivalent to inserting a new object with that field uninitialized, so it seems to me like the expected behaviour, rather than requiring some custom default value.

moigagoo commented 4 years ago

Setting DEFAULT for new columns to the respective Nim type's default value seems like the right thing to do. Not only will it make writing migrations easier but it also makes the generated schemas more consistent with the objects they are created from.

However, after researching the topic a bit, I couldn't find an easy way to implement it without writing horrible nested ifs. So, I suggest we move that to a separate issue that can be discussed, developed, and released after migrations.

To make writing migrations easier now, I can add default param to addColumn so that you won't have to pollute the schema with it. This is very cheap but has a significant impact: the model abstraction doesn't leak from the migration level.

Also, I'm trying to make default pragma accept typed instead of string so that you could write numbers and empty strings easily, but so far I couldn't make it work. So, this is another thing to be implemented in a separate PR.

moigagoo commented 4 years ago

I've spent some time trying to infer defaults automatically but failed so far. I'm merging the PR that implements migrations as it is just not to let it rot.

Inferring defaults comes hand in hand with rewriting this horrible mess: https://github.com/moigagoo/norm/blob/develop/src/norm/sqlite/sqlgen.nim#L70

I haven't found a way to do that so far, unfortunately.

So, for now, when adding a new column, you just have to provide a default for it, just like with a regular ALTER TABLE ADD COLUMN. I agree, it does look ugly and feels unnecessary, but it's the best I can do right now. Any help is appreciated.

ghost commented 4 years ago

Sorry for the necromancy. Is there any interest in a tool that would automatically generate migrations, similar to how Django does? I feel like it wouldn't be beyond my current skill level, and I'd love to work on it if you're iterested.

moigagoo commented 4 years ago

@3n-k1 Hi and thanks for your interest in Norm!

In fact, I have a plan to start developing a tool to generate migrations right after I'm done with 1.1.0 release. At the moment, I an procrastinating and not writing the docs, which is why it is taking so long.

Any help is very much appreciated. Please look through the open issues marked with milestone 1.1.0.

I think it's good idea to create the repo for this migration tool I have in mind and start the discussion there. I'll do it now and post a link here.

UPD: https://github.com/moigagoo/norman/issues/1

ghost commented 4 years ago

@moigagoo Awesome! Thanks so much for your work :)