Open robkooper opened 4 years ago
Ok, I ran the command and am running a dump and sync now.
Any improvement?
In the meantime, Bailey UW has run into new key constraint issues with our database. Sharing the guts of the email, since I wonder if it is related. Curious about your thoughts, @robkooper, @istfer, or @dlebauer?
" I'm getting the error 'duplicate key violates unique constraint' when setting up the runs & run configs for the knots in the beginning of the pda.emulator workflow (my.write.config() in pda.init.run()):
Error in postgresqlExecStatement(conn, statement, ...) : RS-DBI driver: (could not Retrieve the result : ERROR: duplicate key value violates unique constraint "unique_time_interval_per_model_site_parameter_list_and_ensemble"
DETAIL: Key (model_id, site_id, start_time, finish_time, parameter_list, ensemble_id)=(5000000001, 678, 2008-01-01 00:00:00, 2008-12-31 00:00:00, 5000000408.knot.1, 5000000408) already exists.
Looking into the postgresql db though (in the 'runs' table), the key does NOT already exist. The last run entered is run ID 5000039715, which was one of the ensemble runs for workflow ID 5000000254 (the full year NEE sensitivity run I did). That single run finished on 3/13 and is a little over halfway through the ensemble runs for that workflow ID; the full ensemble finished on 3/18. The remaining half of the runs from that ensemble workflow apparently never synced or got added to the postgresql db, nor have any of the runs from the other ensemble run I started on 3/19 (for LE sensitivity, which is also ~halfway through now).
It looks like this error pops up sometimes when the primary key sequence in a given table has somehow become out of sync, sometimes as a result of running a big import or something. It looks like the solution is to manually reset the primary key index, but first you have to restore from a dump file (no idea)
[...]
-The commands I found for manually resetting are for keys that are unique number sequences, so you can just use max/nextval/setval commands. These keys are combinations of things, so they are more difficult. We could use run ID as the filtering value, but we'd have to back up the previous ones that didn't get added before being able to do this?
I think that if the previous keys/runs are backed up then the next key entry (for the run I'm setting up now) would be sequential, and it might be a non-issue and we won't have to figure out how to manually reset. Hopefully. The runs that are missing went smoothly and all the associated info is present in the server files/R studio, so everything's there, just needs to be added to the db. "
The error says that the constraint unique_time_interval_per_model_site_parameter_list_and_ensemble fails, which means that there is already a row in the database with exactly the same (model_id, site_id, start_time, finish_time, parameter_list, ensemble_id).
I think this is the row that throws the error:
bety=# select * from runs where ensemble_id = 5000000408;
id         | model_id   | site_id | start_time          | finish_time         | outdir                                   | outprefix | setting | parameter_list    | created_at                 | updated_at                 | started_at | finished_at | ensemble_id
-----------+------------+---------+---------------------+---------------------+------------------------------------------+-----------+---------+-------------------+----------------------------+----------------------------+------------+-------------+------------
5000040491 | 5000000001 | 678     | 2008-01-01 00:00:00 | 2008-12-31 00:00:00 | /home/carya/output//PEcAn_5000000254/out |           |         | 5000000408.knot.1 | 2020-03-24 00:00:42.499183 | 2020-03-24 00:00:42.499183 |            |             | 5000000408
This is to prevent multiple runs in the same ensemble with the same parameters etc. I guess if you want to force a rerun of that specific ensemble you need to remove the run and the data.
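For anyone who wants to see the behaviour in isolation, here is a minimal sketch using Python's built-in sqlite3 as a stand-in for PostgreSQL. The table is a simplified, hypothetical version of runs that keeps only the six constrained columns. The constraint name and the key values are copied from the error above; everything else is an assumption for illustration. It shows a second insert with the same six values being rejected, then accepted once the existing row is deleted.

```python
import sqlite3

# In-memory SQLite stands in for the PostgreSQL bety database (assumption).
conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE runs (
        id INTEGER PRIMARY KEY,
        model_id INTEGER,
        site_id INTEGER,
        start_time TEXT,
        finish_time TEXT,
        parameter_list TEXT,
        ensemble_id INTEGER,
        CONSTRAINT unique_time_interval_per_model_site_parameter_list_and_ensemble
            UNIQUE (model_id, site_id, start_time, finish_time,
                    parameter_list, ensemble_id)
    )
""")

# Key values taken from the DETAIL line of the error message.
key = (5000000001, 678, "2008-01-01 00:00:00", "2008-12-31 00:00:00",
       "5000000408.knot.1", 5000000408)
INSERT_SQL = ("INSERT INTO runs (model_id, site_id, start_time, finish_time, "
              "parameter_list, ensemble_id) VALUES (?, ?, ?, ?, ?, ?)")


def try_insert(connection, row):
    """Return True if the row was inserted, False on a uniqueness violation."""
    try:
        with connection:  # commits on success, rolls back on error
            connection.execute(INSERT_SQL, row)
        return True
    except sqlite3.IntegrityError:
        return False


first = try_insert(conn, key)   # no matching row yet, so this succeeds
second = try_insert(conn, key)  # same six values again, so this is rejected

# Removing the existing run for that ensemble frees the key again.
with conn:
    conn.execute("DELETE FROM runs WHERE ensemble_id = ?", (5000000408,))
third = try_insert(conn, key)

row_count = conn.execute("SELECT COUNT(*) FROM runs").fetchone()[0]
print(first, second, third, row_count)
```

In the real database the run's output data would also need to be cleaned up, and the sketch says nothing about how the syncing between sites would react to the deleted row.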
Is Bailey running the emulator code line by line?
The only time I have encountered this error is when I run the code line by line and try to re-initiate the run batch without getting a new ensemble id, and like Rob said it was due to duplicate records. If she is running line by line she can pass con=NULL to the pda.init.run function; I did this often when I was just testing. Alternatively, she can get a new ensemble.id by rerunning this line. I haven't seen it outside of this context.
Thanks both. Will work with Bailey to try the solution above or remove the offending row. May temporarily stop syncing until the upcoming migration.
@istfer yes, am running the emulator code line by line trying to figure out an issue w/invalid path arguments I'm getting from my.write.configs in pda.init.run, passed con=NULL to pda.init.run and it solved the unique key error at least, thank you!!
This issue is stale because it has been open 365 days with no activity.
Just loaded the database from site 5 and found that the following row in dbfiles prevents the dump from being loaded:
Once this one row is removed the database is loaded correctly.
It should be possible to delete this row from the database using: