ecamp / ecamp3

eCamp v3 is a web-based app for camp and course planning. The application is specialized for camps and courses of youth associations and for Y+S offers in the sport of camp sports/trekking.
https://ecamp3.ch
GNU Affero General Public License v3.0
113 stars 50 forks source link

Improve dev-data migration #4192

Open manuelmeister opened 11 months ago

manuelmeister commented 11 months ago
DeNic0la commented 11 months ago

Hey, I would like to do that. I think there are a lot of possible solutions and no one stands out to me as 'the best way'. I made a top 4 list and would like some feedback or opinions.

Sort Data in DB I am assuming the order of pg_dump is based on the way the data was written to the DB. Therefore a consistent write order should result in a consistent pg_dump order. I was hoping i could use materialized views for this but as it turns out pg_dump doesn't work on any views. However you can easily create a table instead of a View based on a Select with an Order by. (Alternatively you could combine this with the CSV-Approach). So by creating prefixed tables based on ordered data and then exporting it, it should output the data sorted (and you can drop the tables easily). However, my understanding of (Pg)Sql is limited. Everything you just read may be wrong. \+ Very simple \+ Low Maintenance \+ sort by created-at would always place newly added rows at the bottom \- Might not work the way I want it to.
Post-Dump-Sorting Sort the data after the export (meaning when it's already in the data.sql file). Since we use Git you could probably identify the lines added and move them. Eighter split into one file per table and move them to the end or identify the end of the table in the .SQL file and move them there. The one file per table approach would solve the issue that the order in which the tables are listed changes. \- might not work when there is an additional column (since then, there would be changes in all lines) I found out that there are already some tools like [pgtricks](https://github.com/akaihola/pgtricks) that sort the Dump - I am not sure about licensing and if it even works.
Data Export Script A script that connects to the DB and runs a sorted select on each table, then generates insert statements for that table (using something like string templates). I am used to writing stuff like this in python or java - if you want it in PHP it would probably be a lot better if someone else did that. \+ Full controll \+ Technology Agnostic (doesn't rely on pg_dump, if there would ever be a switch to a different DB it should still work) \- I have a bad feeling about this (it will probably be hell to maintain) \- Poor Performance
Export as CSV In PGSQL you can Copy from and to csv files. When selecting there is the option to sort the table and that's what this approach would do. A Major change would be that the Data is now in CSV format. It would have a worse performance than the current solution but I doubt there is a noticeable difference. \+ Could easily use other tools on the CSV data \+ Simple (already built into PgSql)