opengisch / pum

Postgres Upgrades Manager
GNU General Public License v2.0
30 stars 7 forks source link

How to upgrade db with views #4

Closed marioba closed 6 years ago

marioba commented 7 years ago

For databases with views (and I think is the same for triggers), after we upgrade the table structure we probably need to recreate the views even if they aren't changed so for db with a lot of views (like qwat) we will have a big delta file with a lot of queries for drop and create the views, maybe with only a few real changes in the db structure in the middle.

At the moment the doable solution is to use pre and post delta files. For each delta file, we can have a pre and a post file where drop respectively recreate the views. Pre and post files are applied before respectively after the "main" delta file. For example, if we are applying the file: _delta_1.0.012092017.sql, the pum upgrade command will search if the _delta_1.0.012092017.sql.pre and _delta_1.0.012092017.sql.post files exist and it will apply them before respectively after the main delta file.

So it's doable to drop all the views in the pre file and regenerate them in the post file and if the views aren't changed in the next upgrade we can simply copy and rename the pre and post files to reuse them in the next upgrade.

We need to think and implement a better solution?

marioba commented 7 years ago

@m-kuhn do you want to add something?

m-kuhn commented 7 years ago

Hi @marioba, thanks for raising this.

Have a look at this folder here, there are over 30 files with views. https://github.com/QGEP/datamodel/tree/325c76661f9300a5ec76e7db29c028f4c2edd495/view

Sometimes these views are more than 500 lines long: https://github.com/QGEP/datamodel/blob/325c76661f9300a5ec76e7db29c028f4c2edd495/view/vw_qgep_wastewater_structure.sql

There are also some bits that are not plain .sql, but python scripts that generate sql (https://github.com/QGEP/datamodel/blob/325c76661f9300a5ec76e7db29c028f4c2edd495/scripts/db_setup.sh#L90).

Proposal add a delta_1.0.0_12092017.pre.py and a add a delta_1.0.0_12092017.post.py file where any code can be executed. This way complexity can be moved to the project using pum and removed from pum itself.

m-kuhn commented 7 years ago

As an example, that's more or less what I could imagine.

delta_1.0.0_12092017.pre.sql

DROP VIEW qgep.vw_qgep_wastewater_structure;

delta_1.0.0_12092017.post.py

def run(pum_interface):
    pum_interface.execute_sql('CREATE VIEW ...')
    # or
    from subprocess import Popen, PIPE
    p = Popen(['command', 'and', 'args'], stdout=PIPE, stderr=PIPE, stdin=PIPE)
    output = p.stdout.read()
    pum_interface.execute_sql(output)

    # or even
    import glob
    import os
    files = glob.glob(os.path.join(pum_interface.delta_directory(), 'post-scripts', '*.py')
    for file in files:
        p = Popen(['python', file], stdout=PIPE, stderr=PIPE, stdin=PIPE)
        output = p.stdout.read()
        pum_interface.execute_sql(output)

    pum_interface.report_info('Yay, we applied some new fancy view')
    pum_interface.report_error('Oops, something went terribly wrong')
    raise Exception('We totally fucked it up for you™')
m-kuhn commented 7 years ago

CC @3nids @sylvainbeo

3nids commented 7 years ago

good idea @m-kuhn for the sake of simplicity, I would allow both py and sql to be run.

sylvainbeo commented 7 years ago

@m-kuhn @3nids I'm not sure to well understand the .post.py file. A delta file must be written by someone who don't have any programmation skills. I can't imagine a Qwat user writting your example. Or did i misunderstand something ?

m-kuhn commented 7 years ago

I don't think it's a requirement, that a delta file must be written by someone who doesn't have programming skills.

sylvainbeo commented 7 years ago

@m-kuhn Ok. Agree with the 2 first points. About the last one, are we sure that the delta file is the right place for such logic ? (open question).

m-kuhn commented 7 years ago

Problem statement: we have complicated views that are generated automatically from underlying table structures with some rules (Object Oriented mapping). When the underlying structure changes the view needs to be updated.

Approach: this can be easily regenerated with some logic (python) but requires a lot of (duplicated and uninteresting) SQL.

sylvainbeo commented 7 years ago

Agree with the problem statement. But we discussed about that in the past: if there is already a script that generates the views (or objects related to tables), why duplicate it in the delta ? The script has just to be run again once the upgrade test has been made (?).

I think, i'm missing something here.

m-kuhn commented 7 years ago

If you look at the sample code above, that's what it does. The code to update is not inside the delta-script, the delta-script calls the update script and pipes the output to the sql engine.

sylvainbeo commented 7 years ago

Woa. My bad. Sorry i did not see that.

So maybe the delta.py file should only contain the script name to execute, and the mechanics should be in the pum code ?

m-kuhn commented 7 years ago

So that would be a delta.list file with scripts to execute? Only python scripts or any sort of executable (.sh with shebang... etc.) And whatever they print on stdout will be executed as SQL? How would they receive environment information like which versions delta update is being applied and where the update script is located? How does error reporting work?

sylvainbeo commented 7 years ago

Well, it could be a list of parameters (path, error messages, etc...). I was just trying to avoid the developer to write some code for that.

m-kuhn commented 7 years ago

Yes, the code is quite hard to read, a simpler solution would be nice. What could be done maybe:

marioba commented 7 years ago

In my opinion, we can have normal delta.sql and delta.py files (with pre/post if needed) and the delta.py must extend an hypotetical DeltaPy class with some useful method like:

sylvainbeo commented 7 years ago

In my opinion, we can have normal delta.sql and delta.py files (with pre/post if needed) and the delta.py must extend an hypotetical DeltaPy class with some useful method

Seems a good solution. We have to see if it's worth it to develop this, instead of letting the dev do their own code in the deltas.

marioba commented 7 years ago

Briefly, in summary, based on the comments posted here and on qwat, I imagine the upgrade workflow like this:

execute pre-all.py if exists
execute pre-all.sql if exists

for each file delta_x.x.x_ddmmyyyy.* ordered by version number:
    execute delta_x.x.x_ddmmyyyy.pre.py if exists
    execute delta_x.x.x_ddmmyyyy.pre.sql if exists

    execute delta_x.x.x_ddmmyyyy.py if exists 
    execute delta_x.x.x_ddmmyyyy.sql if exists

    execute delta_x.x.x_ddmmyyyy.post.py if exists
    execute delta_x.x.x_ddmmyyyy.post.sql if exists

execute post-all.py if exists
execute post-all.sql if exists

Do you agree?

m-kuhn commented 7 years ago

Sounds very nice to me!

marioba commented 7 years ago

I implemented the new features as described in my previous message. So is now possible to have python delta and pre-all / post-all files. You can find more info on the pum's README file.

sylvainbeo commented 7 years ago

+1 Sounds pretty good.