jboxberger / synology-gitlab

Updated and improved original Synology package
MIT License
129 stars 20 forks

Database update/migration failed from v12.9.2-0055 to v13.0.3-0055 #44

Closed: helmut-steiner closed this issue 3 years ago

helmut-steiner commented 4 years ago

Hey, I ran the update from GitLab v12.9.2-0055 to v13.0.3-0055 early this morning, but it seems my database was not correctly migrated to the new PostgreSQL version. After the update I was presented with a password reset screen instead of my usual login screen, and even after the reset I couldn't log in with my username and password. I had to uninstall the new version, clear the files, reinstall the old version and reload my backups. It looked like none of my settings were ported over, but the upgrade logs didn't show any abnormalities. Kind regards, Helmut

jboxberger commented 4 years ago

Hi Helmut,

Thank you for your issue, I will investigate this update path. I am very glad that you had backups and could get it working again. Usually you can uninstall the broken version with the "keep data" checkbox checked and reinstall the previous version with the "use existing data" option. But a backup is the best and cleanest way. Please keep this procedure in the future and back up before every update.

Do you have complex CI pipelines or big BLOBs in your GitLab instance?

I will give you feedback asap.

helmut-steiner commented 4 years ago

Thanks for the swift reply! I do have a couple of CI pipelines, not sure about the BLOBs. Keeping the data during uninstallation didn't work: after reinstalling the old version, it failed to start with an error code. It seems the update process altered the old database as well.

helmut-steiner commented 4 years ago

Also, I still had the old database folder "10" next to the folder "12", plus the upgrade log files (most of them empty or one-liners) and a lock file in the postgresql folder. So I guess it must have crashed somewhere along the way without an error code.

jboxberger commented 4 years ago

Hello Helmut,

I've tested the upgrade 2 or 3 times, but only with a limited data set on my dev system. I could not find any issues, so I think there could be an issue with your data set.

Maybe you can check your log files and compare them with mine.

synology_gitlab_postgresql: synology_gitlab_postgresql.log
synology_gitlab: synology_gitlab.log

I've also built you one version with postgres:10 for testing (synology-gitlab-stock-aio-13.0.3-0055.zip) and an updated one, so you can test whether the newer version migrates without issues (synology-gitlab-stock-aio-13.0.5-0055.zip). Please note that I haven't tested 13.0.5-0055 yet, so handle with care and, as always, backup, backup, backup :-).

The download is valid for 7 days (I recommend Chromium for the download): https://send.firefox.com/download/b37c3686135896ac/#hLz7dgkyAqrOytmbEc0R0w

Please let me know about your progress, would love to see your issue solved.

helmut-steiner commented 4 years ago

Thanks a lot for your tests and effort! I will give it a shot on the weekend. Can't risk having troubles in the production environment during work hours. :) I will keep you in the loop if it works!

iConMayrhofer commented 4 years ago

Got the same error. GitLab came up like a fresh install. None of the configuration I had made showed on the login screen. Also the log says:

Setting up GitLab for firstrun. Please be patient, this could take a while...

jboxberger commented 4 years ago

Thank you for your feedback. I've revoked the release.

There seems to be a problem with the PostgreSQL migration routine within the PostgreSQL container... If you check my log, you will find the migration at the bottom. Could you please post your whole log?

My PostgreSQL Log.

2020-06-07 17:06:57 stdout ‣ Migration in progress. Please be patient...Performing Consistency Checks
2020-06-07 17:06:54 stdout debconf: delaying package configuration, since apt-utils is not installed
2020-06-07 17:06:50 stdout ‣ Installing PostgreSQL 10...
2020-06-07 17:06:50 stdout ‣ Migrating PostgreSQL 10 data to 12...
2020-06-07 17:06:50 stdout Initializing database...
2020-06-07 17:06:50 stdout setfacl: /etc/resolv.conf: Operation not supported
2020-06-07 17:06:50 stdout Setting resolv.conf ACLs...
2020-06-07 17:06:50 stdout Initializing rundir...
2020-06-07 17:06:50 stdout Initializing logdir...
2020-06-07 17:06:50 stdout Initializing certdir...
2020-06-07 17:06:49 stdout Initializing datadir..

Kind regards

iConMayrhofer commented 4 years ago

Here is my log

synology_gitlab_postgresql.log

iConMayrhofer commented 4 years ago

It seems to have done the same, doesn't it?

jboxberger commented 4 years ago

Well, not really... your container is shutting down during the migration.

mine:
2020-06-08 16:42:46 stdout  Initializing datadir...
2020-06-08 16:42:00 stdout    gitlab
2020-06-08 16:41:52 stdout  Creating dump of database schemas
2020-06-08 16:41:52 stdout  Creating dump of global objects                             ok
2020-06-08 16:41:51 stdout  Checking for invalid "sql_identifier" user columns          ok
2020-06-08 16:41:51 stdout  Checking for tables WITH OIDS                               ok
2020-06-08 16:41:51 stdout  Checking for contrib/isn with bigint-passing mismatch       ok
2020-06-08 16:41:51 stdout  Checking for reg* data types in user tables                 ok
2020-06-08 16:41:51 stdout  Checking for prepared transactions                          ok
2020-06-08 16:41:51 stdout  Checking database connection settings                       ok
2020-06-08 16:41:51 stdout  Checking database user is the install user                  ok
2020-06-08 16:41:46 stdout  Checking cluster versions                                   ok
2020-06-08 16:41:46 stdout  -----------------------------
2020-06-08 16:41:46 stdout  ‣ Migration in progress. Please be patient...Performing Consistency Checks
2020-06-08 16:40:48 stdout  debconf: delaying package configuration, since apt-utils is not installed
2020-06-08 16:40:29 stdout  ‣ Installing PostgreSQL 10...
2020-06-08 16:40:29 stdout  ‣ Migrating PostgreSQL 10 data to 12...
2020-06-08 16:40:29 stdout  Initializing database...
2020-06-08 16:40:29 stdout  setfacl: /etc/resolv.conf: Operation not supported
2020-06-08 16:40:29 stdout  Setting resolv.conf ACLs...
2020-06-08 16:40:29 stdout  Initializing rundir...
2020-06-08 16:40:29 stdout  Initializing logdir...
2020-06-08 16:40:29 stdout  Initializing certdir...
2020-06-08 16:40:25 stdout  Initializing datadir... 
yours:
2020-06-07 17:07:51 stdout  Initializing datadir...
2020-06-07 17:07:45 stdout  2020-06-07 17:07:45.211 UTC [1] LOG:  database system is shut down
2020-06-07 17:07:43 stdout  2020-06-07 17:07:43.578 UTC [1665] LOG:  shutting down
2020-06-07 17:07:43 stdout  2020-06-07 17:07:43.577 UTC [1] LOG:  background worker "logical replication launcher" (PID 1670) exited with exit code 1
2020-06-07 17:07:43 stdout  2020-06-07 17:07:43.572 UTC [1] LOG:  received smart shutdown request
2020-06-07 17:07:03 stdout  2020-06-07 17:07:03.579 UTC [1] LOG:  database system is ready to accept connections
2020-06-07 17:07:03 stdout  2020-06-07 17:07:03.564 UTC [1664] LOG:  database system was shut down at 2020-06-07 17:07:03 UTC
2020-06-07 17:07:03 stdout  2020-06-07 17:07:03.545 UTC [1] LOG:  listening on Unix socket "/var/run/postgresql/.s.PGSQL.5432"
2020-06-07 17:07:03 stdout  2020-06-07 17:07:03.536 UTC [1] LOG:  listening on IPv6 address "::", port 5432
2020-06-07 17:07:03 stdout  2020-06-07 17:07:03.536 UTC [1] LOG:  listening on IPv4 address "0.0.0.0", port 5432
2020-06-07 17:07:03 stdout  2020-06-07 17:07:03.536 UTC [1] LOG:  starting PostgreSQL 12.3 (Ubuntu 12.3-1.pgdg18.04+1) on x86_64-pc-linux-gnu, compiled by gcc (Ubuntu 7.5.0-3ubuntu1~18.04) 7.5.0, 64-bit
2020-06-07 17:07:03 stdout  Starting PostgreSQL 12...
2020-06-07 17:07:03 stdout  ‣ Granting access to gitlab_user user...
2020-06-07 17:07:03 stdout  ‣ Loading pg_trgm extension...
2020-06-07 17:07:03 stdout  Creating database: gitlab...
2020-06-07 17:07:03 stdout  Creating database user: gitlab_user
iConMayrhofer commented 4 years ago

Oh, you are right. Missed that

jboxberger commented 4 years ago

I am trying to find out why... I am using a VM on an SSD as my test environment, so my system is much faster than a usual Synology. After the installation, the installation script shuts down all containers and starts them again; this comes from the original Synology package. Maybe that's why the migration is aborted by a container shutdown...

If this theory is correct, the synology-gitlab-stock-aio-13.0.3-0055.zip from here should work fine. It has the Postgres 10 bundled with the 13.0.3. If you feel comfortable with it you can give it a try. https://send.firefox.com/download/b37c3686135896ac/#hLz7dgkyAqrOytmbEc0R0w

I will check this. Thank you very much.

iConMayrhofer commented 4 years ago

I will first try to back up. If I have time after that, I'd be willing to try it.

jboxberger commented 4 years ago

Sorry, the snippet of your log that you sent me was too small, so I compared the wrong places. Could you please send me the full log?

iConMayrhofer commented 4 years ago

I did export this log from the Synology GUI. There wasn't more. Sorry

jboxberger commented 4 years ago

Do you see more in the GUI itself? Maybe the export function truncates something?

In your log there are several errors from the migration itself. The migration fails, so GitLab starts with a clean DB. But I cannot see where the problem begins, because the log starts "in the middle" of the migration process.

2020-06-08 16:45:27 stdout  2020-06-08 16:45:27.647 UTC [177] STATEMENT:  INSERT INTO "application_settings" ("default_projects_limit", "signup_enabled", "gravatar_enabled", "created_at", "updated_at", "restricted_visibility_levels", "import_sources", "default_group_visibility", "repository_checks_enabled", "health_check_access_token", "repository_storages", "sign_in_text_html", "help_page_text_html", "shared_runners_text_html", "after_sign_up_text_html", "dsa_key_restriction", "plantuml_enabled", "unique_ips_limit_per_user", "unique_ips_limit_time_window", "default_artifacts_expire_in", "uuid", "cached_markdown_version", "password_authentication_enabled_for_web", "commit_email_hostname", "runners_registration_token_encrypted", "productivity_analytics_start_date", "id") VALUES ($1, $2, $3, $4, $5, $6, $7, $8, $9, $10, $11, $12, $13, $14, $15, $16, $17, $18, $19, $20, $21, $22, $23, $24, $25, $26, $27) RETURNING "id"
2020-06-08 16:45:27 stdout  2020-06-08 16:45:27.647 UTC [177] DETAIL:  Key (id)=(1) already exists.
2020-06-08 16:45:27 stdout  2020-06-08 16:45:27.647 UTC [177] ERROR:  duplicate key value violates unique constraint "application_settings_pkey"
2020-06-08 16:43:59 stdout  2020-06-08 16:43:59.063 UTC [167] STATEMENT:  CREATE DATABASE "gitlab" ENCODING = 'unicode'
2020-06-08 16:43:59 stdout  2020-06-08 16:43:59.063 UTC [167] ERROR:  database "gitlab" already exists
2020-06-08 16:43:43 stdout  2020-06-08 16:43:43.997 UTC [162] STATEMENT:  SELECT "application_settings".* FROM "application_settings" ORDER BY "application_settings"."id" DESC LIMIT $1
2020-06-08 16:43:43 stdout  2020-06-08 16:43:43.997 UTC [162] ERROR:  relation "application_settings" does not exist at character 38
2020-06-08 16:43:43 stdout  2020-06-08 16:43:43.995 UTC [162] STATEMENT:  SELECT "application_settings".* FROM "application_settings" ORDER BY "application_settings"."id" DESC LIMIT $1
2020-06-08 16:43:43 stdout  2020-06-08 16:43:43.995 UTC [162] ERROR:  relation "application_settings" does not exist at character 38
2020-06-08 16:43:43 stdout  2020-06-08 16:43:43.891 UTC [161] STATEMENT:  SELECT "application_settings".* FROM "application_settings" ORDER BY "application_settings"."id" DESC LIMIT $1
2020-06-08 16:43:43 stdout  2020-06-08 16:43:43.891 UTC [161] ERROR:  relation "application_settings" does not exist at character 38
2020-06-08 16:43:43 stdout  2020-06-08 16:43:43.888 UTC [161] STATEMENT:  SELECT "application_settings".* FROM "application_settings" ORDER BY "application_settings"."id" DESC LIMIT $1
2020-06-08 16:43:43 stdout  2020-06-08 16:43:43.888 UTC [161] ERROR:  relation "application_settings" does not exist at character 38
2020-06-08 16:42:56 stdout  2020-06-08 16:42:56.557 UTC [1] LOG:  database system is ready to accept connections
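
To see which errors dominate an exported container log like the one above, the ERROR lines can be grouped and counted. This is only a minimal sketch, assuming the log was exported from the Synology GUI to a plain-text file. The function name `summarize_pg_errors` and the path in the usage comment are hypothetical, not part of the package.

```shell
#!/bin/sh
# Sketch: count distinct PostgreSQL ERROR messages in an exported log file.
# summarize_pg_errors is a hypothetical helper name, not part of the package.
summarize_pg_errors() {
    log_file="$1"
    # Keep only the text from "ERROR:" onward (drops timestamps and PIDs),
    # trim trailing whitespace, then count identical messages and list the
    # most frequent ones first.
    grep -o 'ERROR:.*' "$log_file" | sed 's/[[:space:]]*$//' | sort | uniq -c | sort -rn
}

# Usage (the path is an assumption):
# summarize_pg_errors /volume1/docker/synology_gitlab_postgresql.log
```

Grouping the messages makes it easier to tell a single root error from the follow-up errors it triggers, which is useful when the log starts in the middle of the migration.
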
iConMayrhofer commented 4 years ago

No there wasn't more. I also wondered why the log was so short. But I didn't have time to go into it in detail

iConMayrhofer commented 4 years ago

Maybe the GUI itself truncates the log? But that doesn't seem logical. Where are these logs normally saved? I'm not that familiar with Docker.

jboxberger commented 4 years ago

No, I meant that the export maybe truncates or shortens it. I've copied mine directly from the GUI. OK, if there is nothing more, I hope that Helmut will maybe get some helpful lines. As far as I can see, there is something in the database that breaks the migration within the PSQL container. I don't think that I can solve the problem directly, but maybe I can build a workaround.

Thanks a lot for the information.

jboxberger commented 4 years ago

@iConMayrhofer when you have your backup running, and before the next upgrade, you should rename the "12" folder in the psql data folder to something like "12.backup". I am not sure about that, but I think it may cause problems when you try to install a working psql 12 package someday.

iConMayrhofer commented 4 years ago

I think I would have time to test the 13.0.3 version a second time, if that helps.

iConMayrhofer commented 4 years ago

So these are my logs from 13.0.3. That's all.

synology_gitlab_postgresql.log synology_gitlab.log

jboxberger commented 4 years ago

OK, this one says:

2020-06-08 21:35:22 stdout  2020-06-08 21:35:22.354 UTC [167] STATEMENT:  CREATE DATABASE "gitlab" ENCODING = 'unicode'
2020-06-08 21:35:22 stdout  2020-06-08 21:35:22.354 UTC [167] ERROR:  database "gitlab" already exists

Could you please check whether the folder /volume1/docker/gitlab/postgresql/12 (something like that) exists? This seems to prevent the migration from being executed.

iConMayrhofer commented 4 years ago

Yeah, there is a 12 Folder

jboxberger commented 4 years ago

I assume you need to rename it before the install... it contains the broken "gitlab" database from the previous installation. I am not on my dev system right now to check this, but I am pretty sure that this is why the migration says that the database already exists.
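
The rename suggested above could be sketched as follows. This is a minimal example under stated assumptions: the data path /volume1/docker/gitlab/postgresql is only the "something like that" path from this thread, the helper name `set_aside_pg_dir` is hypothetical, and the GitLab package should be stopped before moving anything.

```shell
#!/bin/sh
# Sketch: move a leftover PostgreSQL data directory out of the way.
# set_aside_pg_dir is a hypothetical helper; the data path is an assumption.
set_aside_pg_dir() {
    data_root="$1"     # e.g. /volume1/docker/gitlab/postgresql
    version="$2"       # e.g. 12
    src="$data_root/$version"
    dst="$src.backup"
    if [ ! -d "$src" ]; then
        echo "nothing to do: $src does not exist"
        return 0
    fi
    # Never clobber an earlier backup of the same directory.
    if [ -e "$dst" ]; then
        echo "refusing to overwrite existing $dst" >&2
        return 1
    fi
    mv "$src" "$dst" && echo "moved $src to $dst"
}

# Usage, with the GitLab package stopped (run as root or via sudo):
# set_aside_pg_dir /volume1/docker/gitlab/postgresql 12
```

Renaming instead of deleting keeps the data from the failed attempt around in case it is needed for later inspection.
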

iConMayrhofer commented 4 years ago

I deleted the 12 folder before installation. But as version 13.0.3 uses PostgreSQL 12, it installed it. (In case it isn't clear: I'm talking about the original 13.0.3. I will test the 13.0.3 version with PostgreSQL 10 now.)

iConMayrhofer commented 4 years ago

Version 13.0.3 with PostgreSQL 10 works. But an update from there to 13.0.5 doesn't; the migration doesn't seem to work. Here are my logs: synology_gitlab_postgresql.log synology_gitlab.log

jboxberger commented 4 years ago

@iConMayrhofer: thanks a lot for your tests. Yes, you're absolutely right. The migration from postgres 10 to 12 fails because of your data set.

I cannot really do anything about it, since I cannot reproduce the problem with my data set, and because of security concerns I cannot ask for your data set to test with.

This is the problem (from your synology_gitlab_postgresql.log):

2020-06-09 05:37:16 stdout  2020-06-09 05:37:16.519 UTC [143] ERROR:  duplicate key value violates unique constraint "application_settings_pkey"

This may be the solution: https://gitlab.com/gitlab-org/gitlab-foss/-/issues/31976#note_35234678

Since the log does not start from the beginning, I can only guess that "application_settings_pkey" is the error. Maybe it is only one of several errors, but once it is solved we can look for the next ones and solve them one by one.

Another problem, from the gitlab container, is:

2020-06-09 05:42:00 stdout ActiveRecord::StatementInvalid: PG::DuplicateTable: ERROR: relation "abuse_reports" already exists

I am not sure why this error happens. It may happen because the files from a previous migration attempt (/volume1/docker/gitlab/postgresql/12) were not deleted before the upgrade, or the error is caused by another issue which is not in the log.

I've built a version for you with postgres 11.8. Maybe this one can migrate your data set correctly, but I do not really believe so, because this error is a hard database error. I don't know whether the constraint application_settings_pkey is new or why it did not lead to an error previously, but for now it is what it is and has to be solved.

synology-gitlab_synology-gitlab-stock-aio-11.11.0-0053.spk

I will keep testing to get some more information. If you get any new errors, please post them here; they are very helpful for isolating the problem.

Kind Regards

jboxberger commented 4 years ago

I have an idea for a workaround:

1) back up GitLab with psql 10
2) install GitLab with the same version but psql 12
3) restore the backup in GitLab
4) update to higher versions

If you want to give it a try, please tell me your current version so I can provide you the same version with psql 12.

Kind Regards

iConMayrhofer commented 4 years ago

I'm on Gitlab 12.9.2-0055

jboxberger commented 4 years ago

Here you go: synology-gitlab-stock-aio-12.9.2-12-0055.spk

iConMayrhofer commented 4 years ago

It worked. I didn't even need to restore my backup. I simply installed it over the old version and everything was there after the update.

So I will try to install a 13.0.X version. Should I test the 13.0.5 version or go with 13.0.3?

jboxberger commented 4 years ago

Let's stick with 13.0.3, I will upload the version with psql 12 for you in a moment.

jboxberger commented 4 years ago

Download: synology-gitlab-stock-aio-13.0.3-12-0055.spk

helmut-steiner commented 4 years ago

Thank you guys for getting to the bottom of this issue!! I am looking forward to testing this on the weekend. :)

jboxberger commented 4 years ago

@helmut-steiner before you start, could you please inspect your psql table application_settings for duplicate keys? I need to find a solution; a workaround is great, but I cannot publish a new release that fails to migrate. I am glad that you two have made backups and are familiar with backing up, restoring and reinstalling, but not everyone using this package has these skills, even if it's documented.

@iConMayrhofer @helmut-steiner: maybe you can dump your "faulty" application_settings for me (of course replacing the sensitive data) so I can build a check or a fix before installation or something like that.
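
The duplicate-key inspection asked for above could look roughly like this. It is a sketch, not a tested procedure: the container name, the postgres user and the gitlab database name are the ones used elsewhere in this thread, while the function names `duplicate_id_query` and `check_duplicate_ids` are hypothetical.

```shell
#!/bin/sh
# Sketch: look for duplicate ids in application_settings.
# Container, user and database names are taken from this thread;
# the function names are hypothetical.
duplicate_id_query() {
    # Lists every id that occurs more than once, with its row count.
    echo 'SELECT id, COUNT(*) FROM application_settings GROUP BY id HAVING COUNT(*) > 1;'
}

check_duplicate_ids() {
    # Feeds the query to psql inside the PostgreSQL container via stdin.
    duplicate_id_query | sudo docker exec -i synology_gitlab_postgresql \
        bash -c "sudo -u postgres psql gitlab"
}

# Usage on the NAS:
# check_duplicate_ids
```

A result of "(0 rows)" means no id occurs twice. Since a primary key normally rules out duplicates anyway, an empty result would not contradict the "already exists" error from the log, which is raised when a new row is inserted with an id that is already taken.
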

iConMayrhofer commented 4 years ago

If you tell me the best way to dump my application_settings, I can do it.

jboxberger commented 4 years ago

OK, I am at work at the moment; tonight I will build you a command to dump it to an SQL file. :thumbsup:

iConMayrhofer commented 4 years ago

Ok, Thanks for your help and this repo. Good Work 👍

helmut-steiner commented 4 years ago

I ran into another issue after reinstalling version 12.9.2 and restoring the backup: The runners pages in GitLab are not accessible anymore and throw a 500 error. The backup process doesn't save the secrets.json file and I don't have it in my hyper backup as it is buried somewhere in the NAS folder structure outside of the visible volume. So long story short: I lost the secrets.json file after restoring the backup and cannot access the admin pages for my runners. There is a way to reset the runners but I don't know how to access them in the Synology environment: https://docs.gitlab.com/ee/raketasks/backup_restore.html#reset-runner-registration-tokens There exists a stackoverflow question about this as well: https://stackoverflow.com/questions/54216933/internal-server-error-500-while-accessing-gitlab-admin-runners

How can I access the right console in the NAS environment? Do you know the commands for resetting the runners? Any help is much appreciated!

@jboxberger After solving this, the steps to backup and restore the secrets file should be added to the repo description. What do you think?

jboxberger commented 4 years ago

start dbconsole:

sudo docker exec -it synology_gitlab bash -c "sudo /home/git/gitlab/bin/rails dbconsole -e production"

password is in your environment vars (default: gitlab_pass)

start GitLab Rails console:

sudo docker exec -it synology_gitlab bash -c "sudo -u git -H bundle exec rails console -e production"

bash into container:

sudo /usr/local/bin/docker exec -it synology_gitlab bash

Did this help?

"@jboxberger After solving this, the steps to backup and restore the secrets file should be added to the repo description. What do you think?"

Of course they should!

jboxberger commented 4 years ago

@iConMayrhofer here are some basic commands for you and, as promised, the command for backing up the table.

dump table application_settings

sudo docker exec -i synology_gitlab_postgresql bash -c "sudo -u postgres pg_dump --table public.application_settings gitlab" > /volume1/docker/application_settings.sql

dump database

sudo docker exec -i synology_gitlab_postgresql bash -c "sudo -u postgres pg_dump gitlab" > /volume1/docker/outfile.sql

restore database

cat /volume1/docker/outfile.sql | sudo docker exec -i synology_gitlab_postgresql bash -c "sudo -u postgres psql gitlab"

bash into postgresql container

sudo docker exec -it synology_gitlab_postgresql bash

psql console

sudo -u postgres psql

connect to the gitlab database

\c gitlab

show tables

\dt

quit

\q
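
Before sharing or restoring a dump made with the commands above, it may be worth checking that the file is not empty or cut off. The following is a minimal sketch, assuming the default plain-text pg_dump output, which starts with a "PostgreSQL database dump" comment and ends with a "PostgreSQL database dump complete" comment. The helper name `dump_looks_complete` is hypothetical.

```shell
#!/bin/sh
# Sketch: sanity-check a plain-text pg_dump file before sharing or restoring it.
# dump_looks_complete is a hypothetical helper name.
dump_looks_complete() {
    dump_file="$1"
    # The file must exist and be non-empty.
    [ -s "$dump_file" ] || { echo "missing or empty: $dump_file" >&2; return 1; }
    # Plain-text dumps open with this comment line ...
    grep -q '^-- PostgreSQL database dump$' "$dump_file" \
        || { echo "no dump header in $dump_file" >&2; return 1; }
    # ... and close with this one; its absence suggests a truncated dump.
    grep -q '^-- PostgreSQL database dump complete$' "$dump_file" \
        || { echo "no completion marker in $dump_file" >&2; return 1; }
    echo "ok: $dump_file"
}

# Usage with the file names from the commands above:
# dump_looks_complete /volume1/docker/application_settings.sql
# dump_looks_complete /volume1/docker/outfile.sql
```

This only checks the framing of the file, not its contents, so sensitive values still have to be replaced by hand before a dump is posted anywhere.
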

helmut-steiner commented 4 years ago

The dbconsole line did the trick. Thanks Juri! fyi: I had to wait a couple of minutes on my old NAS till the console started up. The first two times I tried I thought it hung somewhere. Once I got my runners set up again I'll dump those application settings.

helmut-steiner commented 4 years ago

I couldn't see any duplicate keys in the settings table but maybe the dump helps you in some way: https://nas.helmutsteiner.net:65502/sharing/ABLN9BOZ8 Link is valid for 7 days.

iConMayrhofer commented 4 years ago

https://icongmbhsrl-my.sharepoint.com/:u:/g/personal/k_mayrhofer_icon_bz_it/ERl4G-z-T3hLj5ukTquXu3EBUJIegZrhl3WpFep5SOhcbw?e=i82rs8 Link is valid for 7 days

jboxberger commented 4 years ago

OK, thank you both for providing the example files, but I could not locate the exact problem.

@iConMayrhofer: your DB version was from 13.0.3, so I could not test the migration from 12.9.2 to 13.0.3.

@helmut-steiner: my migration went fine with your application_settings. No errors at all.

So I still need a full log :-(.

Do you guys get the same error with the PSQL 11 version? synology-gitlab-stock-aio-13.0.3-11-0055.spk

iConMayrhofer commented 4 years ago

@jboxberger Oh sorry, I have already deleted everything from version 12.9.2

"Do you guys get the same error with the PSQL 11 version? synology-gitlab-stock-aio-13.0.3-11-0055.spk"

Sorry for my late reply, if you'd still like to test it, I could try it eventually. I just don't know when

helmut-steiner commented 4 years ago

Hey Juri, I still didn't get the chance to try the PSQL 11 version, but as everything worked on your end with v12 I will try the migration on my second NAS device. I cannot risk the downtime at the moment on my production environment as we are in the middle of intense testing and bugfixing for an upcoming release. Sorry for the inconvenience!

caco3 commented 4 years ago

Is there progress on the issue? Due to some issues with 12.9, I would like to upgrade.

helmut-steiner commented 4 years ago

Hey, I tried the upgrade one more time yesterday on our production environment NAS (which is 7 years old by now) and it failed again with the same issue. Restoring the backup from there on my newer NAS and upgrading to the 13.0.3 version there: no issues! It seems like the old NAS's CPU is too slow to handle the database migration process in time, or another resource issue might be leading to this. I backed up the 13.0.3 version from my test NAS and deployed it onto a fresh 13.0.3 installation on the production NAS. Everything is back to normal now. I guess if you have a newer device the upgrade works just fine. We are ordering a new production device next week to be on the safe side for the future.

IMPORTANT NOTES: The upgrade process overwrites the secret keys in the docker container configuration. (Side note for @jboxberger: they should not be overwritten during the upgrade process.) Be sure to save the secrets before upgrading if you want to restore a backup with configured runners!!! You can also copy the secrets.yml and gitlab.yml files to the backup folder using the following commands:

sudo /usr/local/bin/docker exec -it synology_gitlab bash
cp config/gitlab.yml ../data/backups/
cp config/secrets.yml ../data/backups/

or copy the whole config folder to your backup folder and restore it after restoring the backup.
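
As an alternative to copying inside the container, the same two files could be pulled straight to a host folder with `docker cp`. This is only a sketch under assumptions: the config directory /home/git/gitlab/config is inferred from the rails path used earlier in this thread, the destination folder in the usage comment is made up, and the helper name `save_gitlab_config` is hypothetical.

```shell
#!/bin/sh
# Sketch: copy GitLab config files from the container to a host folder.
# Assumptions: the container's config lives in /home/git/gitlab/config and
# DOCKER_BIN points at a docker binary the current user is allowed to run.
DOCKER_BIN="${DOCKER_BIN:-/usr/local/bin/docker}"
CONTAINER="${CONTAINER:-synology_gitlab}"
CONFIG_DIR="${CONFIG_DIR:-/home/git/gitlab/config}"

save_gitlab_config() {
    dest="$1"
    mkdir -p "$dest" || return 1
    # Copy each file separately so a missing one is reported clearly.
    for name in gitlab.yml secrets.yml; do
        "$DOCKER_BIN" cp "$CONTAINER:$CONFIG_DIR/$name" "$dest/" || return 1
    done
}

# Usage (destination is an assumption; run as root or via sudo):
# save_gitlab_config /volume1/docker/gitlab-config-backup
```

Keeping the copy outside the container means it survives an uninstall of the package, which is the situation where the secrets went missing earlier in this thread.
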

Kind regards, Helmut