Extremely long sync with some DBs (e.g. BU)

serbinsh commented 2 years ago

Pretty sure I have raised this before, but when our DB has gone down for a period of time and I re-enable the sync, syncing with BU can take a very long time (hours). Even during normal operation, a sync sometimes doesn't finish before my next sync cron job starts.

Does anyone have suggestions on how they manage to keep their DBs synced while avoiding jobs that do not complete before the next one starts? Should I create a separate sync with BU that runs less often than, say, the ones with WI and UIUC?

Feedback requested.
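
One way to keep a slow sync from colliding with the next cron run is to have the job take a lock and exit quietly if the previous run is still going. A minimal sketch, assuming a Unix host and a Python wrapper around the sync; the names `single_instance` and `run_sync_once` and the lock path are hypothetical, not part of the existing sync scripts:

```python
import fcntl
import os
from contextlib import contextmanager


@contextmanager
def single_instance(lock_path):
    """Yield True if this process holds an exclusive lock on lock_path, else False."""
    fd = os.open(lock_path, os.O_CREAT | os.O_RDWR, 0o644)
    try:
        try:
            # Non-blocking: fail immediately if another run holds the lock.
            fcntl.flock(fd, fcntl.LOCK_EX | fcntl.LOCK_NB)
            acquired = True
        except BlockingIOError:
            acquired = False
        yield acquired
    finally:
        # Closing the descriptor releases the lock if we held it.
        os.close(fd)


def run_sync_once(lock_path, sync):
    """Call sync() unless another run holds the lock; return whether it ran."""
    with single_instance(lock_path) as acquired:
        if not acquired:
            return False
        sync()
        return True
```

With this, the cron entry can stay at its current frequency: a run that starts while the previous one is still syncing returns `False` and does nothing instead of piling up.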

robkooper commented 2 years ago

For Illinois I switched to running it at midnight, so that the database is synced by the next morning. This is something I want to look at in the future (time permitting). The question is whether you need to sync so frequently. It is probably worth checking how frequently things change.

dlebauer commented 2 years ago

Is it mostly a particular table or set of tables (e.g. runs)? We could consider cleaning out the unnecessary records and/or syncing those tables only infrequently.

robkooper commented 2 years ago

@dlebauer that is a good idea, we can skip the runs table, which will be the biggest set. That is a quick (famous last words) fix.

serbinsh commented 2 years ago

I will also note that I occasionally see errors like this:

---- BU
URL with bety dump                 : https://psql-pecan.bu.edu/sync/dump/bety.tar.gz
Remote start ID                    :  1000000001
Remote end ID                      :  1999999999
Local start ID                     :  2000000001
Local end ID                       :  2999999999

gzip: stdin: unexpected end of file
tar: Unexpected EOF in archive
tar: Unexpected EOF in archive
tar: Error is not recoverable: exiting now
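
The `unexpected end of file` messages suggest the downloaded archive was truncated. One option is to check the file before trying to load it, similar in spirit to `gzip -t`. A rough sketch, assuming the dump has already been saved to a local path; `dump_is_complete` is a hypothetical helper, not something in the sync script:

```python
import gzip
import os
import zlib


def dump_is_complete(path, chunk_size=1 << 20):
    """Return True if the gzip file at path is non-empty and decompresses to the end."""
    try:
        if os.path.getsize(path) == 0:
            return False  # an empty download is not a usable dump
        with gzip.open(path, "rb") as stream:
            # Read and discard everything; a truncated or corrupt stream
            # raises before the end is reached.
            while stream.read(chunk_size):
                pass
    except (EOFError, OSError, zlib.error):
        return False
    return True
```

If the check fails, the sync could retry the download or skip that remote for this round rather than aborting partway through.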

The rationale for syncing more often is that collaborative projects across institutions (e.g. NASA CMS) may need to share updated posteriors etc. Waiting 24 hrs between syncs could be an issue.

I am game to try a more complicated sync where we sync what we can more often and sync, say, the runs only once a day or so.
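
To make the two-tier idea concrete, here is a small sketch of how a wrapper could decide which tables are due on each cron run. The intervals, the `tables_due` helper, and the idea of tracking a last-synced timestamp per table are all assumptions for illustration; the current sync script does not work this way as far as this thread shows:

```python
from datetime import datetime, timedelta


def tables_due(tables, last_synced, now, default_interval, slow_intervals):
    """Return the tables whose sync interval has elapsed, in the given order.

    last_synced maps table name -> datetime of the last successful sync.
    slow_intervals maps table name -> longer interval for large tables.
    """
    due = []
    for table in tables:
        interval = slow_intervals.get(table, default_interval)
        previous = last_synced.get(table)
        # Tables that have never been synced are always due.
        if previous is None or now - previous >= interval:
            due.append(table)
    return due


# Example: hourly by default, but runs only once a day (assumed values).
now = datetime(2022, 1, 24, 3, 0)
last = {"posteriors": now - timedelta(hours=2), "runs": now - timedelta(hours=2)}
due_now = tables_due(
    ["posteriors", "runs", "inputs"],
    last,
    now,
    default_interval=timedelta(hours=1),
    slow_intervals={"runs": timedelta(days=1)},
)
```

Here `due_now` contains `posteriors` and `inputs` but not `runs`, so updated posteriors would still be shared frequently while the large table waits for its daily slot.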

serbinsh commented 2 years ago

I think so, or at least there is still an issue, perhaps not the same one as the missing file. See below for the sync log. It looks like the sync still times out before it gets through all of the tables.

Syncing BETYdb

Mon 24 Jan 2022 03:00:01 AM EST

/data/home/sserbin

---- BU
URL with bety dump                 : https://psql-pecan.bu.edu/sync/dump/bety.tar.gz
Remote start ID                    :  1000000001
Remote end ID                      :  1999999999
Local start ID                     :  2000000001
Local end ID                       :  2999999999
Checking schema                    : MATCHED SCHEMA version 9d0f7330c9ef2f0572c0bbbfa463a59c
Started psql (pid=1535141)
Updated  formats                   :          77
Updated  machines                  :           5
Updated  mimetypes                 :           3
Updated  users                     :          51
Updated  attributes                :        4492
Updated  benchmarks                :          73
Updated  citations                 :         153
Updated  covariates                :          26
Updated  ensembles                 :       31042 (+1)
Updated  inputs                    :      521824 (+398)
Updated  likelihoods               :     4254893
Updated  managements               :           2
Updated  metrics                   :          13
Updated  methods                   :           1
Updated  models                    :          33
Updated  modeltypes                :          15
Updated  pfts                      :         200
Updated  posteriors                :       22611 (+1)
Updated  priors                    :         528
Updated  reference_runs            :         121
Updated  runs                      :     3043167 (+100)
Updated  sites                     :       19486
Updated  species                   :       21022
Updated  treatments                :          10
Updated  variables                 :         381
Updated  workflows                 :       16636 (+2)
Updated  sitegroups                :          27
Updated  dbfiles                   :      624300 (+572)
Updated  traits                    :        1750
Updated  benchmarks_benchmarks_reference_runs :         366
Updated  benchmarks_ensembles      :          84
Updated  benchmarks_ensembles_scores :        3349
Updated  benchmarks_metrics        :         631
Updated  citations_sites           :         254
Updated  citations_treatments      :           7
Updated  formats_variables         :         183
Updated  inputs_runs               :         691
Updated  managements_treatments    :           1
Updated  modeltypes_formats        :          18
Updated  pfts_priors               :        2215
Updated  pfts_species              :      181463
Updated  posteriors_ensembles      :      691855 (+1)
Updated  sitegroups_sites          :       19731
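
To see which tables dominate and how much actually changes between syncs, the `Updated` lines in a log like the one above can be parsed. A minimal sketch, assuming the log format shown here (table name, row count, optional `(+N)` for new rows); `parse_sync_log` and `largest_tables` are hypothetical helpers:

```python
import re

# Matches e.g. "Updated  runs   :     3043167 (+100)"
UPDATED = re.compile(r"^Updated\s+(\S+)\s*:\s*(\d+)(?:\s+\(\+(\d+)\))?\s*$")


def parse_sync_log(text):
    """Return (table, total_rows, new_rows) for each 'Updated' line in text."""
    rows = []
    for line in text.splitlines():
        match = UPDATED.match(line.strip())
        if match:
            table, total, added = match.groups()
            rows.append((table, int(total), int(added or 0)))
    return rows


def largest_tables(rows, n=3):
    """Return the n entries with the highest total row count."""
    return sorted(rows, key=lambda row: row[1], reverse=True)[:n]
```

Applied to the log in this comment, the three largest entries are `likelihoods`, `runs`, and `posteriors_ensembles`, so `likelihoods` may be worth considering alongside `runs` when deciding what to sync less often.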

PecanProject / bety

Extremely long sync with some DBs (e.g. BU) #727