PecanProject / bety

Web-interface to the Biofuel Ecophysiological Traits and Yields Database (used by PEcAn and TERRA REF)
https://www.betydb.org
BSD 3-Clause "New" or "Revised" License
16 stars 38 forks source link

Strange migration issue with Postgres 9.5.x?? #526

Closed serbinsh closed 2 years ago

serbinsh commented 7 years ago

When working with a colleague using VM version 1.4.8 we observed a strange issue when attempting to update their version of BETY to the latest migration.

==  SitesCultivarsUniqueness: migrating =======================================
-- execute("\n      SELECT setval('sites_cultivars_id_seq', GREATEST(1, CAST(1e9 * 99::int AS bigint)), FALSE);\n\n      ALTER TABLE sites_cultivars ADD CONSTRAINT unique_site_id UNIQUE (site_id);\n    ")
   -> 0.0017s
==  SitesCultivarsUniqueness: migrated (0.0025s) ==============================

/usr/lib/postgresql/9.5/bin/pg_dump: invalid option -- 'i'
Try "pg_dump --help" for more information.
rake aborted!
Error dumping database
/home/carya/bety/vendor/bundle/ruby/2.1.0/gems/activerecord-postgis-adapter-0.6.6/lib/active_record/connection_adapters/postgis_adapter/rails3/databases.rake:208:in `block in <top (required)>'
/home/carya/bety/vendor/bundle/ruby/2.1.0/gems/rgeo-activerecord-0.5.0/lib/rgeo/active_record/task_hacker.rb:61:in `call'
/home/carya/bety/vendor/bundle/ruby/2.1.0/gems/rgeo-activerecord-0.5.0/lib/rgeo/active_record/task_hacker.rb:61:in `call'
/home/carya/bety/vendor/bundle/ruby/2.1.0/gems/activerecord-3.2.22.2/lib/active_record/railties/databases.rake:202:in `block (2 levels) in <top (required)>'
/home/carya/bety/vendor/bundle/ruby/2.1.0/gems/activerecord-3.2.22.2/lib/active_record/railties/databases.rake:196:in `block (2 levels) in <top (required)>'
Tasks: TOP => db:structure:dump
(See full trace by running task with --trace)

Info on the VM version of postgres

carya@pecan64 ~ $ psql --version
psql (PostgreSQL) 9.5.4

However, I do not get this error related to the -i option on our server running 9.14.12. I do also have this entry in the man pages

     -i
       --ignore-version
           A deprecated option that is now ignored.

And it looks like in newer version there is not longer that man page entry…..we aren’t sure, however, where pg_dump comes in during the migration to try and change the option…part of the rake call?

Has anyone else seen this with Postgres 9.5? Is there a work around?

serbinsh commented 7 years ago

In case it wasn't clear, I can run the migration fine on 9.4.x and even run commands like:

pg_dump -d bety -i > dumpy.junk

Without any issues

robkooper commented 7 years ago

This is due to an old version of rails and can be ignored. I believe the version of rails we use and postgresql >= 9.5 have this issue. The flag -i that is no longer supported by postgresql, but this command just creates the sql dump file to create the schema at the end of the migration.

dlebauer commented 7 years ago

@robkooper if it can be ignored, can the issue be closed?

robkooper commented 7 years ago

I think so, but it is up to @serbinsh to make sure the migration finished correctly. See also: https://github.com/rails/rails/issues/23030#issuecomment-171084067

serbinsh commented 7 years ago

Problem is this is a UAF collaborator VM and it isn't working for him. Is there a way to turn that option off when running the command to avoid that error?

Again he is using an older VM 1.48 or something and I don't want to make him have to update or modify the ruby install

Sent from my iPhone

On Jul 24, 2017, at 8:33 AM, Rob Kooper notifications@github.com<mailto:notifications@github.com> wrote:

I think so, but it is up to @serbinshhttps://github.com/serbinsh to make sure the migration finished correctly. See also: rails/rails#23030 (comment)https://github.com/rails/rails/issues/23030#issuecomment-171084067

— You are receiving this because you were mentioned. Reply to this email directly, view it on GitHubhttps://github.com/PecanProject/bety/issues/526#issuecomment-317478759, or mute the threadhttps://github.com/notifications/unsubscribe-auth/AC7N6Z4ZFygNugWXoJh0WmyAdqKDE_w5ks5sRMcJgaJpZM4OgMIu.

gsrohde commented 7 years ago

@serbinsh You can check whether the migration succeeded with

bundle exec rake db:migrate:status RAILS_ENV=production

You should see "up" next to each migration listed. It wasn't clear to me whether this pg_dump failure causes the entire migration to be rolled back.

@robkooper, @dlebauer BETYdb officially supports PostgreSQL 9.3. This is what ebi-forecast runs. terra-mepp runs 9.4 and bety6 and 7 run 9.5. 9.3 is the version mentioned in the BETYdb doc here: https://pecan.gitbooks.io/betydb-documentation/content/installing-betydb-web-application.html. I have no objection per se to changing the officially supported version to 9.4 or 9.5, but I'd like to know so that I can test against the later version.

BETYdb runs Rails 3.2, and this version defines the Rake db:structure:dump task to use the -i flag with pg_dump. It's not hard to redefine a Rake task, so this problem, if it is a problem, shouldn't be hard to correct. That said, I've been doing the migrations successfully on bety6 and bety7, so it would seem the db:structure:dump failure doesn't stand in the way of doing a successful migration.

tobeycarman commented 7 years ago

We eventually worked around this issue. We dug around and found the commands to delete the database and then re-instantiate it (on pecan 1.4.8 VM):

$ dropdb -U bety bety
$ sudo -u postgres createdb -O bety bety
$ sudo -u postgres CREATE=YES MYSITE=99 REMOTESITE=0 FIXSEQUENCE=YES USERS=YES pecan/scripts/load.bety.sh

Then in the initial attempt I tried updating first from remote=0 followed by an update from remote=2. This gave the same migration error. Next time around I tried dropping the database, re-instating it, and then skipping the update from remote=0 and updating only from remote=2:

$ sudo -u postgres pecan/scripts/load.bety.sh -m 99 -r 2 -l ~/my-bety-update.log

Which givves the following:

[sudo] password for carya: 
Checking schema                    : MATCHED SCHEMA version 1fad6a9e76205aea2744fb69fcbc9430
Started psql (pid=1984)
Updated  formats                   :      3
Updated  machines                  :      2
Updated  users                     :     16
Updated  citations                 :     12
Updated  covariates                :   1034
Updated  ensembles                 :    293 (+1)
Updated  entities                  :    145
Updated  formats                   :      3
Updated  inputs                    :    134
Updated  machines                  :      2
Updated  methods                   :      1
Updated  models                    :      5
Updated  modeltypes                :      1
Updated  pfts                      :     16
Updated  posteriors                :   1089 (+5)
Updated  priors                    :     28
Updated  runs                      :  21669 (+1)
Updated  sites                     :     10
Updated  species                   :     10
Updated  treatments                :      3
Updated  variables                 :     13
Updated  workflows                 :    203 (+1)
Updated  sitegroups                :      2
ERROR:  new row for relation "dbfiles" violates check constraint "valid_model_refs"
DETAIL:  Failing row contains (2000000003, ed_2.1-opt, /data/software/ED2/ED2.r82, null, null, null, 2000000001, 2015-07-28 15:16:54.202009, 2016-08-03 15:03:54.734726, Model, 1000000003).
CONTEXT:  COPY dbfiles, line 4217: "2000000003  ed_2.1-opt  /data/software/ED2/ED2.r82  \N  \N  \N  2000000001  2015-07-28 15:16:54.202009  2016..."
Updated  traits                    :    816
Updated  citations_sites           :     29
Updated  citations_treatments      :      2
Updated  formats_variables         :     21
Updated  modeltypes_formats        :      1
Updated  pfts_priors               :    227
Updated  pfts_species              :  94559
Updated  posteriors_ensembles      :  16245 (+1)
Updated  sitegroups_sites          :     21

Although I get the error on every update, everything seems to be working ok. And I am even able to update from remote=0 now too (with no errors and no updates, presumably because the remotes 0 and 2 are in sync, and I issued my two update commands back to back).

Here is the output from the above mentioned rails migrate status command:

carya@pecan64 ~/bety (master) $ bundle exec rake db:migrate:status RAILS_ENV=production

database: bety

 Status   Migration ID    Migration Name
--------------------------------------------------
   up     1               ********** NO FILE **********
   up     20130104205059  Add workflow id to ensembles
   up     20130104211901  Add model type to models
   up     20130104211946  Update workflows
   up     20130109205535  Rename user field field
   up     20130222222929  Rename file to container
   up     20130425152503  Change workflows params to text
   up     20130624001504  Change data type for variable
   up     20130629205658  Change revision to string
   up     20130707190720  Add yields and traits views
   up     20130717162614  Update posteriors
   up     20130813212131  Modify traits and yields view
   up     20130829162053  Limit traits and yields view to checked data
   up     20130830184559  Add access level to views
   up     20140418005637  Remove inputs file
   up     20140422155957  Add parent id to pft
   up     20140423220457  Change id to big int
   up     20140506210037  Add and refactor views
   up     20140515205254  Add triggers to check variable ranges
   up     20140521180349  Posteriors tables cleanup
   up     20140604192901  Add fields to variables
   up     20140610210928  Change int to decimal in site
   up     20140617163304  Add db files to model
   up     20140621060009  Add model type to pfts
   up     20140623004229  Add posteriors ensembles
   up     20140624185610  Fix posteriors ensembles
   up     20140708232320  Add geometry to site
   up     20140729045640  Add model type table
   up     20140904220035  Change user id to big int
   up     20140904221818  Use center point for lat lon
   up     20140909212759  Add trait covariates table
   up     20140915153555  Restore default values
   up     20141009160121  Add geometry constraint
   up     20141208165401  Add cultivars pfts
   up     20141211220550  Add foreign key constraints
   up     20150202215147  Add id many to many
   up     20150202220519  Add uniqueness constraints
   up     20150213162341  Add value constraints batch 1
   up     20150313165132  Dbfiles reference constraints
   up     20150521211114  Revise variable value checks
   up     20150624220952  Add sync
   up     20150624222656  Time constraints
   up     20150625184958  Fix id getters
   up     20150904184512  Modify formats runs yields
   up     20151007174432  Add checked to traits and yields view
   up     20151011190026  Add notes user to workflow
   up     20151014182146  Remove dataformat from formats
   up     20160303221049  Change numerics to floats
   up     20160412030352  Add sitegroups
   up     20160523165531  Replace local time with time zone
   up     20160617133217  Add id to table
   up     20160711231257  Create benchmarks tables
   up     20160720182233  Add trait and yield date and time functions
   up     20160930213737  Add experiments table
   up     20161003180105  Extend views
   up     20161005181021  Add score benchmarks ensembles scores
   up     20161129192658  Database change requests
   up     20170118205944  Set experiment table id seq vals
   up     20170415183619  Add sites cultivars
   up     20170712171513  Sites cultivars uniqueness

And here is my current bety git info:

carya@pecan64 ~/bety (master) $ git describe
last_rails_3.0_version-696-g88f35f2
robkooper commented 7 years ago

problem is that you dropped the database and thus lost all of your updates to bety. The error you get is because database 2 depends on records from database 1 and vice versa. Doing the update twice should fix this issue.

tobeycarman commented 7 years ago

Huh. Well not quite sure. Here are the facts that I can recall from my end:

I might get a chance to try the entire process again from the ground up next week (and do a better process of recording the errors), but at the moment I am hesitant to touch it as it is working for my purposes of getting the dvmdostem integration worked out. In the long run, I don't think this will be a problem for us, as I anticipate getting dmvdostem to build on the more recent VM, but for last week while @serbinsh was in Alaska, we just needed something to work.

Hopefully this is helpful for you issue triage efforts! :-)

gsrohde commented 7 years ago

@tobeycarman, @serbinsh

A few things:

If, while doing a migration (rake db:migrate), you see

==  SitesCultivarsUniqueness: migrating =======================================

followed by some commands, followed by

==  SitesCultivarsUniqueness: migrated (0.0418s) ==============================

it means that particular migration succeeded, and since (currently) the SitesCultivarsUniqueness migration is the last one, it means you are up to date, even if later on the output says

/usr/lib/postgresql/9.5/bin/pg_dump: invalid option -- 'i'
Try "pg_dump --help" for more information.
rake aborted!
Error dumping database

Any problems you are having loading data from machines 0 and 2 have nothing to do with the deprecated -i option or the version of PostgreSQL you are using.

But if you create the database from scratch and use the CREATE=YES setting or the -c flag with load.bety.sh, you shouldn't even need to do a migration! (assuming that machine 0 or machine 2 or whichever machine you are importing data from has an up-to-date database schema, which machines 0 and 2 currently do). You will get a copy of the database with all the latest migrations already done.

Here's the real problem: We have constraints in the database to ensure that if a row in one table refers to a row in another table, the row in that other table actually exists. Unfortunately, this doesn't always play well with the load.bety.sh script.

To use the case at hand as an example, the check constraint "valid_model_refs" says that if a row in dbfiles has a value of 'model' in the container_type column, then the number in the container_id column should match the number in the id column of some row in the models table. The line

Failing row contains (2000000003, ed_2.1-opt, /data/software/ED2/ED2.r82, null, null, null, 2000000001, 2015-07-28 15:16:54.202009, 2016-08-03 15:03:54.734726, Model, 1000000003)

in the error output above shows that there was no model row having id number 1000000003. This is probably because that row originated on machine 1 and doesn't get imported from machine 0 or machine 2. My guess is that if you do an import from machine 1, the error you get importing from machine 2 will go away.

This seems to me an unsatisfactory state of affairs. I'm not sure of the best way is to eliminate or at least mitigate these problems.

serbinsh commented 7 years ago

@robkooper @gsrohde @tobeycarman

Success!! we think..... @tobeycarman is sitting with me here and we migrated his old VM's version of BETY, which as you can see from this thread, didn't work until we ran it this way

bundle exec rake db:migrate:status RAILS_ENV=production

This was the only way it would work. We think it worked because the checksum on the webpage updated to Oct 5th with the correct checksum showing @robkooper changes today

I have no idea why it worked that way but not the standard way where we got the -i error...... But good news since now Tobey can keep his db synced.

gsrohde commented 7 years ago

@robkooper Regarding your comment https://github.com/PecanProject/bety/issues/526#issuecomment-318396582, I would think the "disable trigger all" calls in load.bety.sh would sidestep these errors. It does seem to prevent regular foreign-key constraint violations from being reported. Does it only disable some kinds of constraints and not others?

robkooper commented 7 years ago

I think it does not disable foreign key checks.

gsrohde commented 7 years ago

Actually, as far as I can tell, it does disable foreign key checks as well as user-defined triggers but does not disable check constraints or uniqueness constraints.

serbinsh commented 2 years ago

@gsrohde @dlebauer @robkooper I am thinking we can close this, thoughts?