yoonka / migresia

A simple Erlang tool to automatically migrate Mnesia databases between versions
BSD 2-Clause "Simplified" License
25 stars 6 forks source link

Doesn't quite work on distributed Mnesia nodes #1

Open lostcolony opened 10 years ago

lostcolony commented 10 years ago

Right now attempting to run migrate on an individual node that is part of a multinode Mnesia config will cause it to fail if any upgrader attempts to call mnesia:transform_table, due to the upgrader code only being loaded on a single node, rather than all of them.

Similarly, if an Mnesia node goes down, I'm unable to run check (though by itself this isn't particularly meaningful).

I've addressed these, and in general tried to make programmatic usage of this library a little friendlier, in a fork, but it entailed a fairly significant refactoring, and included some unrelated design decisions that may or may not be in the spirit of what this code was initially intended to be (things like making check and migrate take an app, so that I can have my migrations tied to the apps, rather than the Migresia dependency), to where I'm reluctant to open a pull request. Still, I figured the issues in a distributed scenario was worth mentioning.

amiramix commented 10 years ago

Thanks for forking and letting me know about the changes. I didn't test any more complex scenarios, like the distributed Mnesia nodes, as I didn't have the necessary set up to do it. Thanks for adding those. When the time allows I will review your changes and possibly integrate them back. The intended use of this library was that it could be shipped (e.g. from a build server to production or staging server) in an embedded Erlang installation (e.g. together with all other applications running on the node) and used to upgrade the database to accommodate for any changes in the structure of the database. If your changes allow the same then they are surely in the spirit of the original code.

lostcolony commented 10 years ago

It should; it could even be written to work mostly the same way as the existing code, where -

migresia:migrate()

behaves nearly identically to calling the new code with

try ok = migresia:migrate(migresia) catch _:Reason -> exit(Reason) end

As I've made it so it will never exit by itself, and while I've tried to make it mostly return ok or an {error, Reason} tuple, I haven't changed all of them, so a few exit calls I just turned to throws. And you can have upgraders in any application, so to pull the ones from migresia, you specify it.

About the only variation I can think of in behavior at that point is that the version I have also ignores non-erlang files in the priv/migration directory (in case people put a read me, or use subversion, or similar); they'll just be filtered with an io:format noting them, rather than resulting in an error.

amiramix commented 9 years ago

Just reviewing and merging the changes as now I need to handle a distributed mnesia too. Thanks again for contributing. All looks good, but I would like to better understand the need for migrations per application. If all the applications are running on the same node does it make sense to apply the migrations separately? The timestamps need to be unique anyway. Is it just about separation of concerns?

lostcolony commented 9 years ago

It's separation of concerns, but for a practical reason. I want to be able to upgrade dependency versions (of which Migresia is included; in case there's a change or fix that needs to be made) without losing my migrations. I also don't want to be forced to keep libraries github is hosting in the same repository my code lives (that is, if I write something I want to put on github, that uses Migresia, why should I have to bundle a checked in version of Migresia, with my migrations, with it? When Bob goes to get my library, he necessarily is connected to github; rebar or erlang.mk or whatever I'm using will automatically go retrieve Migresia as part of the build process).

Plus, Rebar, at least (and possibly erlang.mk) don't keep state about what version of a dependency you have; to upgrade, you modify rebar.config, delete the old dependency, and rerun rebar get-deps (and similar for erlang.mk). To have to dance around a priv directory complicates that.

If instead I bundle migrations with my own app, though, I can cleanly delete the dependency directory any time I want, to take in any changes others may have made. Plus, if I'm running multiple applications together, each with their own mnesia tables and upgraders, I don't have to tie them together or have name collisions; I can just run migresia:migrate(App1), migresia:migrate(App2).

Given I'm bundling upgraders with my own app, then, and that I wanted to be able to run them from a shell, it seemed to make sense to just pass in the application they're being supplied under. If you wanted to maintain the existing api, given that migresia includes an application file, you could put in a default version of the exported functions that take an an application that take one less argument, to pass in 'migresia'. I.e.,

migrate() -> migrate('migresia').

That would address both use cases.

amiramix commented 9 years ago

Thanks for the explanation. In my README I stated that I assume someone is creating a link from the migresia priv folder to the folder with migrations. In my system that link was created in the makefile and the folder was somewhere else, in the same repo as the rest of the system.

I acknowledge that having it per application is more flexible, however it complicates rollbacks a little bit. What if the application containing the migrations no longer exists in the project? And it also doesn't play well with OTP releases. When updating a running system we would need to try to apply all migrations for all applications and then rollback all of them if the upgrade wasn't successful. It doesn't invalidate the idea, it's an interesting idea, but I think it may benefit from some additional work, (which I will try to do in the due course, e.g. information about applied migration could contain the application name in addition to the timestamp).

However, I have one more question about this code: %% Basically just to ensure everybody has loaded all the migrations, %% which is necessary in distributed Mnesia transforms. rpc:multicall(migresia_migrations, list_all_ups, [App])

Why do we need to load migrations on other nodes? We don't execute them there? If the database scheme is shared then surely the remote database will be updated automatically when the system is started?

lostcolony commented 9 years ago

Links aren't portable, and would allow only one application to have upgraders per release then (since you'd need two directories at that point).

Yes, having an application move to two applications, and then wanting to downgrade back to one, would require some code restructuring (or at least, supplying both in the release, even if not starting one). It's resolvable from a consumer perspective though. For downgraders, you'd need to run upgraders for App1, undo/downgrade/whatever in case of failure, and then do the same for App2. Presumably you don't have dependencies between the apps, so you should be able to upgrade/downgrade them in isolation anyway. If you -do- have dependencies, it's up to you, the user of the lib, to resolve them by placing your upgraders/downgraders, and calling them, appropriately (easiest thing would be a container app, under which you put your upgraders/downgarders, that has included_applications of both App1 and App2, and under that put your priv migrations; that's actually what I did for one project, and I could use the same migresia checkin that a second project used, because there was no collision in priv directories under migresia. So my 'specific to this app' code lived in my app, and the Migresia library was reusable between projects).

Yeah, I know that seems weird, hence the comment. Try taking it out though and see what happens . When I ran migrate() without it, I got exceptions, and I found some discussions indicating that running a transform from source like that has to have the beam loaded across all nodes. Here, for example, is one - http://toddhalfpenny.com/2012/05/21/possible-erlang-bad-transform-function-solution/

amiramix commented 9 years ago

OK, it makes sense, thanks.

amiramix commented 9 years ago

Just a thought, you could well enough call rpc:multicall(migresia_migrations, list_unapplied_ups, [App]) instead of list_all_ups since the scheme on the remote table will already contain the ups applied on that node?

lostcolony commented 9 years ago

Yeah, that should work. Just, anything that ensures the migrations that you're going to run have been loaded.

pjkCochin commented 8 years ago

Hi Guys, we have been trying to use Migresia for handling Mnesia schema changes for our application.

  1. We are not able to find that the method migresia:create_new_migration/2. is this not exposed? Please let me know if I am missing something here.

Thanks, Paul

amiramix commented 8 years ago

@pjkCochin Thanks for trying it out. I replied to your original ticket. Let me know in case you have any other issues or questions and I will be happy to help.