mbucc / shmig

Database migration tool written in BASH.
BSD 3-Clause "New" or "Revised" License
458 stars 49 forks source link

Implemented config option for recursive migrations (refs mbucc/shmig#43) #44

Closed kael-shipman closed 6 years ago

kael-shipman commented 6 years ago

Ok, I've got an initial implementation of this, though I wasn't able to figure out a good posix-find equivalent to gnu-find's maxdepth parameter. (This looked promising, though I didn't have enough time to figure out how to make it work for our purposes.)

I'm submitting this PR now because it solves the problem posed in issue #43 without regressing; that is, since the code already uses gnu find's maxdepth parameter, we don't lose any portability with this implementation, we just don't gain any.

@mbucc , would you be interested in trying to implement posix compatibility on this?

mbucc commented 6 years ago

Excellent, thank you.

I was thinking of simply removing the -(min|max)depth args completely. Then there is no need for a new config option.

Any objection?

I think there is a small probability of breaking existing setups ...

kael-shipman commented 6 years ago

No problema from me :). Your call!

On Mon, Jun 18, 2018, 19:47 mbucc notifications@github.com wrote:

Excellent, thank you.

I was thinking of simply removing the -(min|max)depth args completely. Then there is no need for a new config option.

Any objection?

I think there is a small probability of breaking existing setups ...

— You are receiving this because you authored the thread. Reply to this email directly, view it on GitHub https://github.com/mbucc/shmig/pull/44#issuecomment-398239271, or mute the thread https://github.com/notifications/unsubscribe-auth/ADUIgjOzTUtwu_cIWpNsMTECBu41ckyWks5t-EoOgaJpZM4UrAvz .

kael-shipman commented 6 years ago

(P.s., @mbucc , if you're planning on removing those options, obviously you can just close this PR and I'll delete my branch. I just need to know because I'm going to start using this in production soon ;) )

mbucc commented 6 years ago

I merged locally, so I'll close this PR.

Before pushing to master, I want to make sure I ensure the order of files returned by a find.

This has opened up a kind of interesting corner case: it looks like the order of files returned by find is not guaranteed. Quoting from https://serverfault.com/a/181815:

find will be traversing the directory tree in the order items are stored within the directory entries. This will (mostly) be consistent from run to run, on the same machine and will essentially be "file/directory creation order" if there have been no deletes.

However, some file systems will re-order directory entries as part of compaction operations or when the size of the entry needs to be expanded, so there's always a small chance the "raw" order will change over time. If you want a consistent order, feed the output through an extra sorting stage.

Also note that he is talking about ordering of entries within each directory; I'm not sure what happens about ordering across different directories.

That also means the file name itself makes no difference to find as far as ordering. We could store use the sha1 of the file contents as the filename and find would still return the same order. find $dir -type f -name '*.sql' | xargs ls -Ur would guarantee the order, but xargs has a limit (which varies by system?) on the total number of input files it will handle. It would really suck if your db migration tool stopped working when you went over some magic number of migrations.

kael-shipman commented 6 years ago

Cool! Looking through the source, it doesn't appear to me that it's a problem that find doesn't return ordered results. (Where it matters, it looks like you always pipe it to sort anyway, which makes sense.) Maybe I'm not understanding the problem you've recognized?

mbucc commented 6 years ago

Here's a scenario I just tried with recursive find where the order is broken. For this test, I just removed the (max|min)depth options in find_migrations() function.

Migration data directories:

./migrations
    ./dev
    ./prod

Create one production table:

$ ./shmig -d test.db -t sqlite3 create table1
generated ./migrations/1529579828-table1.sql
$ mv migrations/1529579828-table1.sql migrations/prod/

and one dev data migration:

$ ./shmig -d test.db -t sqlite3 create dev-data
generated ./migrations/1529579922-dev-data.sql
$ mv migrations/1529579922-dev-data.sql migrations/dev/

When I list pending migrations, I expect prod to come first. But it doesn't.

$ ./shmig -d test.db -t sqlite3 pending
1529579922-dev-data.sql
1529579828-table1.sql
$ ./shmig -d test.db -t sqlite3 up
shmig: applying  'dev-data' (1529579922)... done
shmig: applying  'table1'   (1529579828)... done
$ 
kael-shipman commented 6 years ago

Ah, I see, it's ordering on the full path. That seems easy to fix by just creating a secondary array of results with just basename.

Anyway, we really need this tool for our codebases, so I'm going to develop my own branch with the features we need and you and I can sync up afterwards. I'll be in touch!