nextcloud / server

☁️ Nextcloud server, a safe home for all your data
https://nextcloud.com
GNU Affero General Public License v3.0
27.19k stars 4.04k forks source link

[Bug]: Upgrade from 24 to 25 can't connect to Postgres DB #34679

Closed e-caste closed 1 year ago

e-caste commented 2 years ago

⚠️ This issue respects the following points: ⚠️

Bug description

I can't complete the upgrade from 24 to 25 due to some database connection error. I can confirm the database is accessible, since running any occ db: command works as expected (also, I haven't changed the configuration that worked for 24).

Steps to reproduce

After changing the label of my Docker container from 24-fpm-alpine to 25-fpm-alpine, the database gets updated and everything seems to work, apps are updated, maintenance mode is enabled. When I try to open the web interface, it says that maintenance mode is still enabled. So I run occ upgrade, which unfortunately outputs this:

Nextcloud or one of the apps require upgrade - only a limited number of commands are available
You may use your browser or the occ upgrade command to do the upgrade
Setting log level to debug
Updating database schema
Updated database
Disabled incompatible app: breezedark
Disabled incompatible app: fulltextsearch
Disabled incompatible app: onlyoffice
Disabled incompatible app: unsplash
Repair error: An exception occurred while executing a query: SQLSTATE[HY000]: General error: 7 server closed the connection unexpectedly
    This probably means the server terminated abnormally
    before or while processing the request.
Repair error: An exception occurred while executing a query: SQLSTATE[HY000]: General error: 7 no connection to the server
Repair error: An exception occurred while executing a query: SQLSTATE[HY000]: General error: 7 no connection to the server
Repair error: An exception occurred while executing a query: An exception occurred while executing a query: SQLSTATE[HY000]: General error: 7 no connection to the server
Repair error: An exception occurred while executing a query: SQLSTATE[HY000]: General error: 7 no connection to the server
Repair error: An exception occurred while executing a query: SQLSTATE[HY000]: General error: 7 no connection to the server
Repair warning: Unable to clear the frontend cache
Repair warning: Unable to clear the avatar cache
Repair error: An exception occurred while executing a query: SQLSTATE[HY000]: General error: 7 no connection to the server
Repair error: An exception occurred while executing a query: SQLSTATE[HY000]: General error: 7 no connection to the server
Repair error: An exception occurred while executing a query: SQLSTATE[HY000]: General error: 7 no connection to the server
Repair error: An exception occurred while executing a query: SQLSTATE[HY000]: General error: 7 no connection to the server
Repair error: An exception occurred while executing a query: SQLSTATE[HY000]: General error: 7 no connection to the server
Repair error: An exception occurred while executing a query: SQLSTATE[HY000]: General error: 7 no connection to the server
Repair error: An exception occurred while executing a query: SQLSTATE[HY000]: General error: 7 no connection to the server
Doctrine\DBAL\Exception\DriverException: An exception occurred while executing a query: SQLSTATE[HY000]: General error: 7 no connection to the server
Update failed
Maintenance mode is kept active
Resetting log level

I have also tried upgrading my Postgres database from 13 to 15, but that doesn't seem to solve anything as I'm presented with the same error.

Expected behavior

The upgrade should complete successfully.

Installation method

Community Docker image

Operating system

Debian/Ubuntu

PHP engine version

PHP 8.1

Web server

Nginx

Database engine version

PostgreSQL

Is this bug present after an update or on a fresh install?

Updated to a major version (ex. 22.2.3 to 23.0.1)

Are you using the Nextcloud Server Encryption module?

Encryption is Disabled

What user-backends are you using?

Configuration report

Nextcloud or one of the apps require upgrade - only a limited number of commands are available
You may use your browser or the occ upgrade command to do the upgrade
{
    "system": {
        "passwordsalt": "***REMOVED SENSITIVE VALUE***",
        "secret": "***REMOVED SENSITIVE VALUE***",
        "trusted_domains": [
            "***REMOVED SENSITIVE VALUE***"
        ],
        "datadirectory": "***REMOVED SENSITIVE VALUE***",
        "dbtype": "pgsql",
        "dbname": "***REMOVED SENSITIVE VALUE***",
        "dbhost": "***REMOVED SENSITIVE VALUE***",
        "dbport": "",
        "dbtableprefix": "***REMOVED SENSITIVE VALUE***",
        "dbuser": "***REMOVED SENSITIVE VALUE***",
        "dbpassword": "***REMOVED SENSITIVE VALUE***",
        "installed": true,
        "log_type": "file",
        "logfile": "\/var\/log\/nextcloud\/nextcloud.log",
        "loglevel": "2",
        "mail_smtpmode": "sendmail",
        "remember_login_cookie_lifetime": "7200",
        "log_rotate_size": "10485760",
        "trashbin_retention_obligation": "auto, 180",
        "versions_retention_obligation": "auto, 365",
        "simpleSignUpLink.shown": "false",
        "memcache.local": "\\OC\\Memcache\\APCu",
        "memcache.locking": "\\OC\\Memcache\\Redis",
        "filelocking.enabled": true,
        "redis": {
            "host": "***REMOVED SENSITIVE VALUE***",
            "port": 6379
        },
        "logtimezone": "Europe\/Rome",
        "htaccess.RewriteBase": "\/",
        "enable_previews": true,
        "enabledPreviewProviders": {
            "11": "OC\\Preview\\PNG",
            "12": "OC\\Preview\\JPEG",
            "13": "OC\\Preview\\GIF",
            "14": "OC\\Preview\\BMP",
            "15": "OC\\Preview\\MarkDown",
            "16": "OC\\Preview\\MP3",
            "17": "OC\\Preview\\TXT",
            "18": "OC\\Preview\\Movie"
        },
        "preview_max_x": "2048",
        "preview_max_y": "2048",
        "jpeg_quality": "60",
        "maintenance": true,
        "app_install_overwrite": [
            "drawio",
            "end_to_end_encryption",
            "files_external_dropbox",
            "unsplash",
            "groupfolders",
            "onlyoffice",
            "twofactor_webauthn",
            "breezedark",
            "deck",
            "fulltextsearch"
        ],
        "overwrite.cli.url": "***REMOVED SENSITIVE VALUE***",
        "overwriteprotocol": "https",
        "apps_paths": [
            {
                "path": "\/var\/www\/html\/apps",
                "url": "\/apps",
                "writable": false
            },
            {
                "path": "\/var\/www\/html\/custom_apps",
                "url": "\/custom_apps",
                "writable": true
            }
        ],
        "default_phone_region": "IT",
        "instanceid": "***REMOVED SENSITIVE VALUE***",
        "theme": "",
        "version": "24.0.6.1",
        "mail_sendmailmode": "pipe",
        "mail_from_address": "***REMOVED SENSITIVE VALUE***",
        "mail_domain": "***REMOVED SENSITIVE VALUE***"
    }
}

List of activated Apps

Nextcloud or one of the apps require upgrade - only a limited number of commands are available
You may use your browser or the occ upgrade command to do the upgrade
Enabled:
  - activity: 2.17.0
  - admin_audit: 1.15.0
  - breezedark: 24.0.2
  - calendar: 4.0.1
  - circles: 25.0.0
  - cloud_federation_api: 1.8.0
  - comments: 1.15.0
  - contacts: 5.0.1
  - contactsinteraction: 1.6.0
  - dashboard: 7.5.0
  - dav: 1.24.0
  - federatedfilesharing: 1.15.0
  - files: 1.20.1
  - files_external: 1.17.0
  - files_pdfviewer: 2.6.0
  - files_rightclick: 1.4.0
  - files_sharing: 1.17.0
  - files_trashbin: 1.15.0
  - files_versions: 1.18.0
  - forms: 3.0.0
  - fulltextsearch: 24.0.0
  - groupfolders: 13.0.0
  - integration_dropbox: 1.0.5
  - integration_github: 1.0.12
  - integration_google: 1.0.8
  - integration_reddit: 1.0.5
  - logreader: 2.10.0
  - lookup_server_connector: 1.13.0
  - maps: 0.2.1
  - nextcloud_announcements: 1.14.0
  - notes: 4.6.0
  - notifications: 2.13.1
  - oauth2: 1.13.0
  - onlyoffice: 7.5.4
  - password_policy: 1.15.0
  - photos: 2.0.0
  - previewgenerator: 5.1.0
  - privacy: 1.9.0
  - provisioning_api: 1.15.0
  - recommendations: 1.4.0
  - related_resources: 1.0.1
  - serverinfo: 1.15.0
  - settings: 1.7.0
  - spreed: 15.0.0
  - support: 1.8.0
  - systemtags: 1.15.0
  - text: 3.6.0
  - theming: 2.0.0
  - twofactor_backupcodes: 1.14.0
  - twofactor_totp: 7.0.0
  - unsplash: 2.0.1
  - updatenotification: 1.15.0
  - user_status: 1.5.0
  - viewer: 1.9.0
  - weather_status: 1.5.0
  - workflowengine: 2.7.0
Disabled:
  - bruteforcesettings
  - deck: 1.7.1
  - drawio: 1.0.3
  - encryption
  - external: 4.0.0
  - extract: 1.3.5
  - federation: 1.11.0
  - files_external_dropbox: 1.4.3
  - files_fulltextsearch: 24.0.1
  - firstrunwizard: 2.13.0
  - fulltextsearch_elasticsearch: 24.0.1
  - issuetemplate: 0.7.0
  - mail: 1.14.1
  - metadata: 0.16.0
  - news: 18.2.0
  - podcast: 0.3.1
  - sharebymail: 1.12.0
  - survey_client: 1.12.0
  - suspicious_login
  - twofactor_webauthn: 0.3.3
  - user_ldap

Nextcloud Signing status

Can't login to Nextcloud.

Nextcloud Logs

data/nextcloud.log does not exist.

Additional info

No response

szaimen commented 2 years ago

Can you try the upgrade again without upgrading to postgresql 15?

e-caste commented 2 years ago

@szaimen I have upgraded first with Postgres 13, then trying to fix the error I tried upgrading to Postgres 15 (updating the database), but I got the same error.

e-caste commented 2 years ago

@szaimen a few more details:

Now, as far as I understand from the log above, the issue is with the Repair step of the Upgrade process. Should I try commenting these lines or is it unsafe?
Also, I don't understand why the connection to the DB seems to generally work, but stops working for these specific Repair steps.
This is all still done with the postgres:13-alpine image, no Postgres 15 involved.

This was a perfectly working installation of Nextcloud 24, used daily, so I really would like to understand how I can proceed with the upgrade to 25.

szaimen commented 2 years ago

Do you have something in the database logs that could help identifying the issue?

e-caste commented 2 years ago

@szaimen there you go, the part containing the only WARNING of the full Postgres log after running occ upgrade:

nextcloud-db-1  | 2022-10-20 17:01:17.252 UTC [28] DETAIL:  parameters: $1 = '19', $2 = 'c5fee126eaa8e0bcbcb2c43a9c05d123'
nextcloud-db-1  | 2022-10-20 17:01:18.517 UTC [28] LOG:  execute <unnamed>: SELECT "id" FROM "oc_mimetypes" WHERE "mimetype" = $1
nextcloud-db-1  | 2022-10-20 17:01:18.517 UTC [28] DETAIL:  parameters: $1 = 'httpd/unix-directory'
nextcloud-db-1  | 2022-10-20 17:01:18.517 UTC [28] LOG:  execute <unnamed>: SELECT "id" FROM "oc_mimetypes" WHERE "mimetype" = $1
nextcloud-db-1  | 2022-10-20 17:01:18.517 UTC [28] DETAIL:  parameters: $1 = 'application/vnd.openxmlformats-officedocument.wordprocessingml.document.oform'
nextcloud-db-1  | 2022-10-20 17:01:18.519 UTC [28] LOG:  execute <unnamed>: UPDATE "oc_filecache" SET "mimetype" = $1 WHERE ("mimetype" <> $2) AND ("mimetype" <> $3) AND ("name" ILIKE $4)
nextcloud-db-1  | 2022-10-20 17:01:18.519 UTC [28] DETAIL:  parameters: $1 = '123', $2 = '123', $3 = '2', $4 = '%.oform'
nextcloud-db-1  | 2022-10-20 17:01:20.277 UTC [1] LOG:  server process (PID 28) was terminated by signal 11: Segmentation fault
nextcloud-db-1  | 2022-10-20 17:01:20.277 UTC [1] DETAIL:  Failed process was running: UPDATE "oc_filecache" SET "mimetype" = $1 WHERE ("mimetype" <> $2) AND ("mimetype" <> $3) AND ("name" ILIKE $4)
nextcloud-db-1  | 2022-10-20 17:01:20.277 UTC [1] LOG:  terminating any other active server processes
nextcloud-db-1  | 2022-10-20 17:01:20.277 UTC [25] WARNING:  terminating connection because of crash of another server process
nextcloud-db-1  | 2022-10-20 17:01:20.277 UTC [25] DETAIL:  The postmaster has commanded this server process to roll back the current transaction and exit, because another server process exited abnormally and possibly corrupted shared memory.
nextcloud-db-1  | 2022-10-20 17:01:20.277 UTC [25] HINT:  In a moment you should be able to reconnect to the database and repeat your command.
nextcloud-db-1  | 2022-10-20 17:01:20.280 UTC [1] LOG:  all server processes terminated; reinitializing
nextcloud-db-1  | 2022-10-20 17:01:20.316 UTC [29] LOG:  database system was interrupted; last known up at 2022-10-20 17:00:32 UTC
nextcloud-db-1  | 2022-10-20 17:01:20.666 UTC [29] LOG:  database system was not properly shut down; automatic recovery in progress
nextcloud-db-1  | 2022-10-20 17:01:20.695 UTC [29] LOG:  redo starts at 23/93C5C30
nextcloud-db-1  | 2022-10-20 17:01:20.696 UTC [29] LOG:  invalid record length at 23/93D7820: wanted 24, got 0
nextcloud-db-1  | 2022-10-20 17:01:20.696 UTC [29] LOG:  redo done at 23/93D77F8
nextcloud-db-1  | 2022-10-20 17:01:20.749 UTC [1] LOG:  database system is ready to accept connections

Looks like the entire Postgres process segfaults 🤔 What exact version of Postgres are you using in your automated tests?

szaimen commented 2 years ago

Hm... I am honestly not sure if this is really a Nextcloud issue. Can you try to create a new satabase and restore a database dump and check if the upgrade works with that?

e-caste commented 2 years ago

@szaimen yes and no... Nextcloud 25 launches after messing around with quite a few commands to dump and restore the DB, but then occ only presents the option to run occ maintenance:install and the instance is new, without any previous data.
I have

  1. run docker exec -t <dbcontainer> pg_dump --clean --username=<dbuser> <dbname> > dump_$(date +%d-%m-%Y"_"%H_%M_%S).sql
  2. created a new Docker volume for the DB
  3. removed data/config/config.php, which needs to be automatically created
  4. in docker-compose.yml, set the Postgres container to init the DB from the dump by mounting it in the correct directory:
    volumes:
      - db:/var/lib/postgresql/data
      - ./dump_20-10-2022_19_32_30.sql:/docker-entrypoint-initdb.d/dump.sql
  5. run occ maintenance:install --database-name <dbname> --database-user <dbuser> --database-pass <dbpass> --admin-user <adminuser> --admin-pass <adminpass>

Am I missing a final step here? Or something in between? Running any occ db: command or occ maintenance:repair yields no result, so I guess something is wrong with the dump creation process. Using dumpall (docker exec -t <dbcontainer> pg_dumpall --clean --username=<dbuser> --database=<dbname> > dump_$(date +%d-%m-%Y"_"%H_%M_%S).sql) doesn't work at all, since the dump tries to recreate the DB which apparently Nextcloud has already created.

e-caste commented 1 year ago

The plot thickens: I have restored a previous backup of my VM where I still hadn't updated Nextcloud 24 to 25 at all. That install worked perfectly. But now, if I try to occ, Postgres logs:

2022-10-21T09:00:17.918374470Z 2022-10-21 09:00:17.918 UTC [1] LOG:  starting PostgreSQL 13.7 on x86_64-pc-linux-musl, compiled by gcc (Alpine 11.2.1_git2022022-10-21T09:00:17.919950659Z 2022-10-21 09:00:17.918 UTC [1] LOG:  listening on IPv4 address "0.0.0.0", port 5432
2022-10-21T09:00:17.919966793Z 2022-10-21 09:00:17.918 UTC [1] LOG:  listening on IPv6 address "::", port 5432
2022-10-21T09:00:17.923931215Z 2022-10-21 09:00:17.923 UTC [1] LOG:  listening on Unix socket "/var/run/postgresql/.s.PGSQL.5432"
2022-10-21T09:00:17.929548413Z 2022-10-21 09:00:17.929 UTC [21] LOG:  database system was shut down at 2022-10-21 08:57:30 UTC
2022-10-21T09:00:17.934170546Z 2022-10-21 09:00:17.934 UTC [1] LOG:  database system is ready to accept connections
2022-10-21T09:00:40.053451596Z 2022-10-21 09:00:40.053 UTC [28] LOG:  statement: SET SESSION CHARACTERISTICS AS TRANSACTION ISOLATION LEVEL READ COMMITTED
2022-10-21T09:00:40.316788554Z 2022-10-21 09:00:40.316 UTC [1] LOG:  server process (PID 28) was terminated by signal 11: Segmentation fault
2022-10-21T09:00:40.316836813Z 2022-10-21 09:00:40.316 UTC [1] DETAIL:  Failed process was running: SELECT * FROM "oc_appconfig"
2022-10-21T09:00:40.316847859Z 2022-10-21 09:00:40.316 UTC [1] LOG:  terminating any other active server processes
2022-10-21T09:00:40.318023739Z 2022-10-21 09:00:40.317 UTC [25] WARNING:  terminating connection because of crash of another server process
2022-10-21T09:00:40.318039009Z 2022-10-21 09:00:40.317 UTC [25] DETAIL:  The postmaster has commanded this server process to roll back the current transacti2022-10-21T09:00:40.318048456Z 2022-10-21 09:00:40.317 UTC [25] HINT:  In a moment you should be able to reconnect to the database and repeat your command.
2022-10-21T09:00:40.319653764Z 2022-10-21 09:00:40.319 UTC [1] LOG:  all server processes terminated; reinitializing
2022-10-21T09:00:40.343431053Z 2022-10-21 09:00:40.343 UTC [29] LOG:  database system was interrupted; last known up at 2022-10-21 09:00:17 UTC
2022-10-21T09:00:40.597943211Z 2022-10-21 09:00:40.597 UTC [29] LOG:  database system was not properly shut down; automatic recovery in progress
2022-10-21T09:00:40.601910344Z 2022-10-21 09:00:40.601 UTC [29] LOG:  redo starts at 23/5B63410
2022-10-21T09:00:40.601941906Z 2022-10-21 09:00:40.601 UTC [29] LOG:  invalid record length at 23/5B63448: wanted 24, got 0
2022-10-21T09:00:40.601954215Z 2022-10-21 09:00:40.601 UTC [29] LOG:  redo done at 23/5B63410
2022-10-21T09:00:40.618168195Z 2022-10-21 09:00:40.618 UTC [1] LOG:  database system is ready to accept connections

so essentially SELECT * FROM "oc_appconfig" causes a Postgres segfault, which is funny I guess. But also terrible.
I've checked if the physical disk shows any errors, and it doesn't look like it.

If I remove data/config/config.php, like this post seems to suggest, then docker compose down and docker compose up -d, occ works, but only allows me to maintenance:install (I've found out that a config.php gets indeed auto-generated, but it's empty, hence useless).

What is wrong with my config.php?

<?php
$CONFIG = array (
  'passwordsalt' => '***REMOVED SENSITIVE VALUE***',
  'secret' => '***REMOVED SENSITIVE VALUE***,
  'trusted_domains' =>
  array (
    ***REMOVED SENSITIVE VALUE***
  ),
  'datadirectory' => '/var/www/html/data',
  'dbtype' => 'pgsql',
  'dbname' => 'nextcloud_db',
  'dbhost' => 'db:5432',
  'dbport' => '',
  'dbtableprefix' => 'oc_',
  'dbuser' => '***REMOVED SENSITIVE VALUE***',
  'dbpassword' => '***REMOVED SENSITIVE VALUE***',
  'installed' => true,
  'log_type' => 'file',
  'logfile' => '/var/log/nextcloud/nextcloud.log',
  'loglevel' => '2',
  'mail_smtpmode' => 'sendmail',
  'remember_login_cookie_lifetime' => '7200',
  'log_rotate_size' => '10485760',
  'trashbin_retention_obligation' => 'auto, 180',
  'versions_retention_obligation' => 'auto, 365',
  'simpleSignUpLink.shown' => 'false',
  'memcache.local' => '\\OC\\Memcache\\APCu',
  'memcache.locking' => '\\OC\\Memcache\\Redis',
  'filelocking.enabled' => true,
  'redis' =>
  array (
    'host' => 'redis',
    'port' => 6379,
  ),
  'logtimezone' => 'Europe/Rome',
  'htaccess.RewriteBase' => '/',
  'enable_previews' => true,
  'enabledPreviewProviders' =>
  array (
    11 => 'OC\\Preview\\PNG',
    12 => 'OC\\Preview\\JPEG',
    13 => 'OC\\Preview\\GIF',
    14 => 'OC\\Preview\\BMP',
    15 => 'OC\\Preview\\MarkDown',
    16 => 'OC\\Preview\\MP3',
    17 => 'OC\\Preview\\TXT',
    18 => 'OC\\Preview\\Movie',
  ),
  'preview_max_x' => '2048',
  'preview_max_y' => '2048',
  'jpeg_quality' => '60',
  'maintenance' => false,
  'app_install_overwrite' =>
  array (
    0 => 'drawio',
    1 => 'end_to_end_encryption',
    2 => 'files_external_dropbox',
    3 => 'unsplash',
    4 => 'groupfolders',
    5 => 'onlyoffice',
    6 => 'twofactor_webauthn',
    7 => 'breezedark',
    8 => 'deck',
    9 => 'fulltextsearch',
  ),
  'overwrite.cli.url' => '***REMOVED SENSITIVE VALUE***',
  'overwriteprotocol' => 'https',
  'apps_paths' =>
  array (
    0 =>
    array (
      'path' => '/var/www/html/apps',
      'url' => '/apps',
      'writable' => false,
    ),
    1 =>
    array (
      'path' => '/var/www/html/custom_apps',
      'url' => '/custom_apps',
      'writable' => true,
    ),
  ),
  'default_phone_region' => 'IT',
  'instanceid' => '***REMOVED SENSITIVE VALUE***',
  'theme' => '',
  'version' => '24.0.6.1',
  'mail_sendmailmode' => 'pipe',
  'mail_from_address' => 'no-reply',
  'mail_domain' => '***REMOVED SENSITIVE VALUE***',
);
e-caste commented 1 year ago

Nextcloud 24 IS ALIVE! Had to specify postgres:13.8-alpine as the image, as I was using postgres:13.7-alpine.
I'll keep this issue open for when I try to update again to 25, in case the same bug presents itself again.

For completeness, I still have a few issues with the DB at startup, but at least they don't prevent me from running the whole Nextcloud. Postgres logs:

2022-10-21T09:47:46.198399620Z
│2022-10-21T09:47:46.198472235Z PostgreSQL Database directory appears to contain a database; Skipping initialization
│2022-10-21T09:47:46.198482227Z
│2022-10-21T09:47:46.235984675Z 2022-10-21 09:47:46.235 UTC [1] LOG:  starting PostgreSQL 13.8 on x86_64-pc-linux-musl, compiled by gcc (Alpine 11.2.1_git202│2022-10-21T09:47:46.236275877Z 2022-10-21 09:47:46.236 UTC [1] LOG:  listening on IPv4 address "0.0.0.0", port 5432
│2022-10-21T09:47:46.236599293Z 2022-10-21 09:47:46.236 UTC [1] LOG:  listening on IPv6 address "::", port 5432
│2022-10-21T09:47:46.241011479Z 2022-10-21 09:47:46.240 UTC [1] LOG:  listening on Unix socket "/var/run/postgresql/.s.PGSQL.5432"
│2022-10-21T09:47:46.247230386Z 2022-10-21 09:47:46.247 UTC [20] LOG:  database system was shut down at 2022-10-21 09:47:18 UTC
│2022-10-21T09:47:46.252406056Z 2022-10-21 09:47:46.252 UTC [1] LOG:  database system is ready to accept connections
│2022-10-21T09:50:04.747809670Z 2022-10-21 09:50:04.747 UTC [117] ERROR:  invalid page in block 75871 of relation base/16384/16826
│2022-10-21T09:50:04.747919204Z 2022-10-21 09:50:04.747 UTC [117] STATEMENT:  SELECT "a"."name" FROM "oc_filecache" "a" LEFT JOIN "oc_filecache" "b" ON CAST(│2022-10-21T09:50:04.748027361Z 2022-10-21 09:50:04.747 UTC [116] ERROR:  invalid page in block 75871 of relation base/16384/16826
│2022-10-21T09:50:04.748072707Z 2022-10-21 09:50:04.747 UTC [116] CONTEXT:  parallel worker
│2022-10-21T09:50:04.748086050Z 2022-10-21 09:50:04.747 UTC [116] STATEMENT:  SELECT "a"."name" FROM "oc_filecache" "a" LEFT JOIN "oc_filecache" "b" ON CAST(│2022-10-21T09:50:04.754514445Z 2022-10-21 09:50:04.754 UTC [1] LOG:  background worker "parallel worker" (PID 117) exited with exit code 1
│2022-10-21T09:50:05.022775257Z 2022-10-21 09:50:05.022 UTC [1] LOG:  background worker "parallel worker" (PID 118) was terminated by signal 11: Segmentation│2022-10-21T09:50:05.022835933Z 2022-10-21 09:50:05.022 UTC [1] DETAIL:  Failed process was running: SELECT "a"."name" FROM "oc_filecache" "a" LEFT JOIN "oc_│2022-10-21T09:50:05.022851340Z 2022-10-21 09:50:05.022 UTC [1] LOG:  terminating any other active server processes
│2022-10-21T09:50:05.023250042Z 2022-10-21 09:50:05.022 UTC [24] WARNING:  terminating connection because of crash of another server process
│2022-10-21T09:50:05.023281631Z 2022-10-21 09:50:05.022 UTC [24] DETAIL:  The postmaster has commanded this server process to roll back the current transacti│2022-10-21T09:50:05.023294975Z 2022-10-21 09:50:05.022 UTC [24] HINT:  In a moment you should be able to reconnect to the database and repeat your command.
│2022-10-21T09:50:05.023366957Z 2022-10-21 09:50:05.023 UTC [116] WARNING:  terminating connection because of crash of another server process
│2022-10-21T09:50:05.023381977Z 2022-10-21 09:50:05.023 UTC [116] DETAIL:  The postmaster has commanded this server process to roll back the current transact│2022-10-21T09:50:05.023390828Z 2022-10-21 09:50:05.023 UTC [116] HINT:  In a moment you should be able to reconnect to the database and repeat your command.│2022-10-21T09:50:05.026736363Z 2022-10-21 09:50:05.026 UTC [1] LOG:  all server processes terminated; reinitializing
│2022-10-21T09:50:05.057004896Z 2022-10-21 09:50:05.056 UTC [119] LOG:  database system was interrupted; last known up at 2022-10-21 09:47:46 UTC
│2022-10-21T09:50:05.302102582Z 2022-10-21 09:50:05.301 UTC [119] LOG:  database system was not properly shut down; automatic recovery in progress
│2022-10-21T09:50:05.306296704Z 2022-10-21 09:50:05.306 UTC [119] LOG:  redo starts at 23/5B64C60
│2022-10-21T09:50:05.310843290Z 2022-10-21 09:50:05.310 UTC [119] LOG:  invalid record length at 23/5B8FDB0: wanted 24, got 0
│2022-10-21T09:50:05.310873851Z 2022-10-21 09:50:05.310 UTC [119] LOG:  redo done at 23/5B8FD88
│2022-10-21T09:50:05.335013559Z 2022-10-21 09:50:05.334 UTC [1] LOG:  database system is ready to accept connections
e-caste commented 1 year ago

Same error after re-trying to update with Postgres 13.8.
App logs:

2022-10-21T11:49:38.668192049Z Update app spreed from App Store
│2022-10-21T11:49:59.503907648Z Repair error: An exception occurred while executing a query: SQLSTATE[HY000]: General error: 7 server closed the connection u│2022-10-21T11:49:59.503977164Z     This probably means the server terminated abnormally
│2022-10-21T11:49:59.503998654Z     before or while processing the request.
│2022-10-21T11:49:59.538856627Z Repair error: An exception occurred while executing a query: SQLSTATE[HY000]: General error: 7 no connection to the server
│2022-10-21T11:49:59.572542315Z Repair error: An exception occurred while executing a query: SQLSTATE[HY000]: General error: 7 no connection to the server
│2022-10-21T11:49:59.650960024Z Repair error: An exception occurred while executing a query: An exception occurred while executing a query: SQLSTATE[HY000]:
│2022-10-21T11:49:59.684907968Z Repair error: An exception occurred while executing a query: SQLSTATE[HY000]: General error: 7 no connection to the server
│2022-10-21T11:49:59.730152182Z Repair error: An exception occurred while executing a query: SQLSTATE[HY000]: General error: 7 no connection to the server
│2022-10-21T11:49:59.764913272Z Repair warning: Unable to clear the frontend cache
│2022-10-21T11:49:59.786299160Z Repair warning: Unable to clear the avatar cache
│2022-10-21T11:49:59.820265387Z Repair error: An exception occurred while executing a query: SQLSTATE[HY000]: General error: 7 no connection to the server
│2022-10-21T11:49:59.853666885Z Repair error: An exception occurred while executing a query: SQLSTATE[HY000]: General error: 7 no connection to the server
│2022-10-21T11:49:59.898965187Z Repair error: An exception occurred while executing a query: SQLSTATE[HY000]: General error: 7 no connection to the server
│2022-10-21T11:50:00.012358502Z Repair error: An exception occurred while executing a query: SQLSTATE[HY000]: General error: 7 no connection to the server
│2022-10-21T11:50:00.068406704Z Repair error: An exception occurred while executing a query: SQLSTATE[HY000]: General error: 7 no connection to the server
│2022-10-21T11:50:00.102442677Z Repair error: An exception occurred while executing a query: SQLSTATE[HY000]: General error: 7 no connection to the server
│2022-10-21T11:50:00.136053567Z Repair error: An exception occurred while executing a query: SQLSTATE[HY000]: General error: 7 no connection to the server
│2022-10-21T11:50:00.158475790Z Doctrine\DBAL\Exception\DriverException: An exception occurred while executing a query: SQLSTATE[HY000]: General error: 7 no
│2022-10-21T11:50:00.169601378Z Update failed
│2022-10-21T11:50:00.181029312Z Maintenance mode is kept active

Postgres logs:

2022-10-21T11:48:17.769572342Z
│2022-10-21T11:48:17.769621828Z PostgreSQL Database directory appears to contain a database; Skipping initialization
│2022-10-21T11:48:17.769631494Z
│2022-10-21T11:48:17.826443500Z 2022-10-21 11:48:17.826 UTC [1] LOG:  starting PostgreSQL 13.8 on x86_64-pc-linux-musl, compiled by gcc (Alpine 11.2.1_git202│2022-10-21T11:48:17.828388120Z 2022-10-21 11:48:17.826 UTC [1] LOG:  listening on IPv4 address "0.0.0.0", port 5432
│2022-10-21T11:48:17.828442306Z 2022-10-21 11:48:17.826 UTC [1] LOG:  listening on IPv6 address "::", port 5432
│2022-10-21T11:48:17.832174984Z 2022-10-21 11:48:17.831 UTC [1] LOG:  listening on Unix socket "/var/run/postgresql/.s.PGSQL.5432"
│2022-10-21T11:48:17.842489921Z 2022-10-21 11:48:17.842 UTC [21] LOG:  database system was shut down at 2022-10-21 11:46:33 UTC
│2022-10-21T11:48:17.850119851Z 2022-10-21 11:48:17.849 UTC [1] LOG:  database system is ready to accept connections
│2022-10-21T11:49:59.477480477Z 2022-10-21 11:49:59.477 UTC [1] LOG:  server process (PID 29) was terminated by signal 11: Segmentation fault
│2022-10-21T11:49:59.477533926Z 2022-10-21 11:49:59.477 UTC [1] DETAIL:  Failed process was running: UPDATE "oc_filecache" SET "mimetype" = $1 WHERE ("mimety│2022-10-21T11:49:59.477545020Z 2022-10-21 11:49:59.477 UTC [1] LOG:  terminating any other active server processes
│2022-10-21T11:49:59.478471298Z 2022-10-21 11:49:59.477 UTC [25] WARNING:  terminating connection because of crash of another server process
│2022-10-21T11:49:59.478511875Z 2022-10-21 11:49:59.477 UTC [25] DETAIL:  The postmaster has commanded this server process to roll back the current transacti│2022-10-21T11:49:59.478530272Z 2022-10-21 11:49:59.477 UTC [25] HINT:  In a moment you should be able to reconnect to the database and repeat your command.
│2022-10-21T11:49:59.480429519Z 2022-10-21 11:49:59.480 UTC [1] LOG:  all server processes terminated; reinitializing
│2022-10-21T11:49:59.525686214Z 2022-10-21 11:49:59.525 UTC [32] LOG:  database system was interrupted; last known up at 2022-10-21 11:48:17 UTC
│2022-10-21T11:49:59.805990905Z 2022-10-21 11:49:59.805 UTC [32] LOG:  database system was not properly shut down; automatic recovery in progress
│2022-10-21T11:49:59.809855236Z 2022-10-21 11:49:59.809 UTC [32] LOG:  redo starts at 23/6DEBF20
│2022-10-21T11:49:59.829065376Z 2022-10-21 11:49:59.828 UTC [32] LOG:  invalid record length at 23/6F14CD8: wanted 24, got 0
│2022-10-21T11:49:59.829085369Z 2022-10-21 11:49:59.828 UTC [32] LOG:  redo done at 23/6F14CB0
│2022-10-21T11:50:00.128879316Z 2022-10-21 11:50:00.128 UTC [1] LOG:  database system is ready to accept connections
e-caste commented 1 year ago

Solved! 🍾

My debug process:

  1. understand what failed (the Repair step of the Upgrade process)
  2. understand why it failed (the Postgres DB throwing some error)
  3. understand that the update from Nextcloud 24 to 25 wasn't the cause, but it was the only thing forcing me to run a Repair process which otherwise I wouldn't have run
  4. understand why the DB failed (some entries in the oc_filecache table were corrupted)
  5. understand how to fix Postgres data corruption (spoiler: you can't. But if you have balls of steel you can save all data that is not corrupted by dding part of an internal DB file in the right place, see this brilliant guide on how to do so, and remember to VACUUM full oc_filecache at the end (you can use a Postgres client, I recommend Postico if you're on macOS)
  6. run occ maintenance:mimetype:update-db and occ maintenance:mimetype:update-db --repair-filecache to check that these commands succeed

Thanks @szaimen for helping me in the first debug steps. Now closing this issue.

EDIT: update to 25 finished correctly!