YunoHost-Apps / pleroma_ynh

Pleroma package for YunoHost: A free, federated social networking server built on open protocols.
https://pleroma.social/
GNU Affero General Public License v3.0

Failed upgrade #217

Closed: MCMic closed this issue 1 year ago

MCMic commented 1 year ago

Describe the bug

I attempted to upgrade from 2.4.3~ynh2 to 2.4.4~ynh1 and the upgrade failed. Now it is not fetching new posts anymore, and I see 500 error notifications all over the UI.

ended_at: 2022-11-18 12:46:22.600068
env:
  BASH_XTRACEFD: '7'
  NO_BACKUP_UPGRADE: '0'
  YNH_APP_BASEDIR: /var/cache/yunohost/app_tmp_work_dirs/app_lobfor4d
  YNH_APP_CURRENT_VERSION: 2.4.3~ynh2
  YNH_APP_ID: pleroma
  YNH_APP_INSTANCE_NAME: pleroma
  YNH_APP_INSTANCE_NUMBER: '1'
  YNH_APP_MANIFEST_VERSION: 2.4.4~ynh1
  YNH_APP_UPGRADE_TYPE: UPGRADE_FULL
  YNH_ARCH: amd64
  YNH_CWD: /var/cache/yunohost/app_tmp_work_dirs/app_lobfor4d/scripts
  YNH_INTERFACE: cli
  YNH_STDRETURN: /tmp/tmpj8yw8ctd/stdreturn
error: "Une erreur s'est produite durant l'ex\xE9cution du script de mise \xE0 niveau\
  \ de l'application"
interface: cli
operation: app_upgrade
parent: null
related_to:
- - app
  - pleroma
started_at: 2022-11-18 11:25:22.677399
success: false
yunohost_version: 11.0.10.2

Logs

https://paste.yunohost.org/raw/beludopaki

tituspijean commented 1 year ago

Relevant part:

2022-11-18 12:51:00,005: DEBUG - 12:50:59.843 [error] Could not create schema migrations table. This error usually happens due to the following:
2022-11-18 12:51:00,005: DEBUG - 
2022-11-18 12:51:00,006: DEBUG -   * The database does not exist
2022-11-18 12:51:00,006: DEBUG -   * The "schema_migrations" table, which Ecto uses for managing
2022-11-18 12:51:00,006: DEBUG -     migrations, was defined by another library
2022-11-18 12:51:00,007: DEBUG -   * There is a deadlock while migrating (such as using concurrent
2022-11-18 12:51:00,007: DEBUG -     indexes with a migration_lock)
2022-11-18 12:51:00,007: DEBUG - 
2022-11-18 12:51:00,008: DEBUG - To fix the first issue, run "mix ecto.create".
2022-11-18 12:51:00,008: DEBUG - 
2022-11-18 12:51:00,008: DEBUG - To address the second, you can run "mix ecto.drop" followed by
2022-11-18 12:51:00,009: DEBUG - "mix ecto.create". Alternatively you may configure Ecto to use
2022-11-18 12:51:00,009: DEBUG - another table and/or repository for managing migrations:
2022-11-18 12:51:00,009: DEBUG - 
2022-11-18 12:51:00,010: DEBUG -     config :pleroma, Pleroma.Repo,
2022-11-18 12:51:00,010: DEBUG -       migration_source: "some_other_table_for_schema_migrations",
2022-11-18 12:51:00,010: DEBUG -       migration_repo: AnotherRepoForSchemaMigrations
2022-11-18 12:51:00,011: DEBUG - 
2022-11-18 12:51:00,011: DEBUG - The full error report is shown below.
2022-11-18 12:51:00,012: DEBUG - 
2022-11-18 12:51:01,338: DEBUG - ** (DBConnection.ConnectionError) connection not available and request was dropped from queue after 2520ms. This means requests are coming in and your connection pool cannot serve them fast enough. You can address this by:
2022-11-18 12:51:01,339: DEBUG - 
2022-11-18 12:51:01,339: DEBUG -   1. Ensuring your database is available and that you can connect to it
2022-11-18 12:51:01,340: DEBUG -   2. Tracking down slow queries and making sure they are running fast enough
2022-11-18 12:51:01,340: DEBUG -   3. Increasing the pool_size (albeit it increases resource consumption)
2022-11-18 12:51:01,341: DEBUG -   4. Allowing requests to wait longer by increasing :queue_target and :queue_interval
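
For what it is worth, point 1 of that list can be checked quickly from the host. A minimal sketch, assuming the database is the local PostgreSQL instance and is named pleroma (an assumption; adjust to your setup):

    # Check that PostgreSQL is running and accepting connections
    systemctl status postgresql --no-pager
    sudo -u postgres pg_isready

    # Confirm the pleroma database still exists
    sudo -u postgres psql -lqt | cut -d '|' -f 1 | grep -w pleroma

    # If it does, see whether Ecto's bookkeeping table is reachable
    sudo -u postgres psql -d pleroma -c "SELECT count(*) FROM schema_migrations;"
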
MCMic commented 1 year ago

So, should I simply attempt to upgrade again?

MCMic commented 1 year ago

So restarting pleroma brings back the same 500 errors. It seems the database is in a bad state, but it is not clear to me whether it is corrupted, down, or something else. Is it the postgres database that it fails to query, or is that something else? I do not know what Ecto is.

MCMic commented 1 year ago

Rebooting did not help; pleroma is still erroring out. In the logs there are errors in the restore as well, I think. "WARNING - sh: 0: getcwd() failed: No such file or directory" appears a lot, and I am not sure what to do with that.

I do not know whether trying to restore again might fix something. I also fear that if I start the upgrade again it may erase the backup from the first try.
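
For reference, the existing archives can be listed and an extra backup made before retrying anything; a quick sketch with the standard yunohost backup commands:

    # List existing backup archives, including the one made before the failed upgrade
    yunohost backup list

    # Create a separate, explicit backup of the app before retrying the upgrade
    yunohost backup create --apps pleroma
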

MCMic commented 1 year ago

I removed the app, then restored it again from the backup. There were no errors in the restore operation, yet the application still behaves the same afterwards :-(

MCMic commented 1 year ago

I have lots of different errors in the logs; it is not clear which ones are the root cause:

nov. 21 21:24:44 <host> pleroma[25490]: ** (Ecto.ConstraintError) constraint error when attempting to insert struct:
nov. 21 21:24:44 <host> pleroma[25490]:     * users_ap_id_index (unique_constraint)
nov. 21 21:24:44 <host> pleroma[25490]: If you would like to stop this constraint violation from raising an
nov. 21 21:24:44 <host> pleroma[25490]: exception and instead add it as an error to your changeset, please
nov. 21 21:24:44 <host> pleroma[25490]: call `unique_constraint/3` on your changeset with the constraint
nov. 21 21:24:44 <host> pleroma[25490]: `:name` as an option.
nov. 21 21:24:44 <host> pleroma[25490]: The changeset defined the following constraints:
nov. 21 21:24:44 <host> pleroma[25490]:     * users_nickname_index (unique_constraint)
nov. 21 21:24:44 <host> pleroma[25490]:     (ecto 3.6.2) lib/ecto/repo/schema.ex:764: anonymous fn/4 in Ecto.Repo.Schema.constraints_to_errors/3
nov. 21 21:24:44 <host> pleroma[25490]:     (elixir 1.10.4) lib/enum.ex:1396: Enum."-map/2-lists^map/1-0-"/2
nov. 21 21:24:44 <host> pleroma[25490]:     (ecto 3.6.2) lib/ecto/repo/schema.ex:749: Ecto.Repo.Schema.constraints_to_errors/3
nov. 21 21:24:44 <host> pleroma[25490]:     (ecto 3.6.2) lib/ecto/repo/schema.ex:730: Ecto.Repo.Schema.apply/4
nov. 21 21:24:44 <host> pleroma[25490]:     (ecto 3.6.2) lib/ecto/repo/schema.ex:350: anonymous fn/15 in Ecto.Repo.Schema.do_insert/4
nov. 21 21:24:44 <host> pleroma[25490]:     (pleroma 2.4.3) lib/pleroma/web/activity_pub/activity_pub.ex:1665: Pleroma.Web.ActivityPub.ActivityPub.make_user_from_ap_id/1
nov. 21 21:24:44 <host> pleroma[25490]:     (pleroma 2.4.3) lib/pleroma/user.ex:1884: Pleroma.User.get_or_fetch_by_ap_id/1
nov. 21 21:24:44 <host> pleroma[25490]:     (pleroma 2.4.3) lib/pleroma/web/activity_pub/transmogrifier.ex:462: Pleroma.Web.ActivityPub.Transmogrifier.handle_incoming/2
nov. 21 21:24:44 <host> pleroma[25490]: Function: #Function<8.97514948/0 in Pleroma.Web.ActivityPub.ActivityPub.make_user_from_ap_id/1>
nov. 21 21:24:44 <host> pleroma[25490]:     Args: []
nov. 21 21:24:46 <host> pleroma[25490]: 21:24:46.927 request_id=FymzTa6NtmLHxEoAANDh [info] Sent 200 in 3791ms
nov. 21 21:24:50 <host> pleroma[25490]: 21:24:50.363 [info] GET /api/v1/pleroma/chats
nov. 21 21:24:50 <host> pleroma[25490]: 21:24:50.385 request_id=FymzT1133dk_KL4AAMTS [info] Sent 200 in 22ms
nov. 21 21:24:51 <host> pleroma[25490]: 21:24:51.256 [info] POST /inbox
nov. 21 21:24:51 <host> pleroma[25490]: 21:24:51.752 request_id=FymzT5KT6299vx8AANEB [info] Sent 200 in 496ms
nov. 21 21:24:51 <host> pleroma[25490]: 21:24:51.805 [info] POST /inbox
nov. 21 21:24:52 <host> pleroma[25490]: 21:24:52.536 [error] Postgrex.Protocol (#PID<0.6743.0>) disconnected: ** (DBConnection.ConnectionError) client #PID<0.30836.0> timed out because it queued and checked out the conne>
nov. 21 21:24:52 <host> pleroma[25490]: #PID<0.30836.0> was at location:
nov. 21 21:24:52 <host> pleroma[25490]:     :prim_inet.recv0/3
nov. 21 21:24:52 <host> pleroma[25490]:     (postgrex 0.15.9) lib/postgrex/protocol.ex:2972: Postgrex.Protocol.msg_recv/4
nov. 21 21:24:52 <host> pleroma[25490]:     (postgrex 0.15.9) lib/postgrex/protocol.ex:2014: Postgrex.Protocol.recv_bind/3
nov. 21 21:24:52 <host> pleroma[25490]:     (postgrex 0.15.9) lib/postgrex/protocol.ex:1869: Postgrex.Protocol.bind_execute_close/4
nov. 21 21:24:52 <host> pleroma[25490]:     (db_connection 2.4.0) lib/db_connection/holder.ex:325: DBConnection.Holder.holder_apply/4
nov. 21 21:24:52 <host> pleroma[25490]:     (db_connection 2.4.0) lib/db_connection.ex:1314: DBConnection.run_execute/5
nov. 21 21:24:52 <host> pleroma[25490]:     (db_connection 2.4.0) lib/db_connection.ex:1409: DBConnection.run/6
nov. 21 21:24:52 <host> pleroma[25490]:     (db_connection 2.4.0) lib/db_connection.ex:574: DBConnection.parsed_prepare_execute/5
nov. 21 21:24:52 <host> pleroma[25490]: 21:24:52.542 request_id=FymzTFb05Gp3XrkAANCh [error] Internal server error: %DBConnection.ConnectionError{message: "tcp recv: closed (the connection was closed by the pool, possibl>
nov. 21 21:24:52 <host> pleroma[25490]: 21:24:52.544 request_id=FymzTFb05Gp3XrkAANCh [info] Sent 500 in 15173ms
nov. 21 21:24:52 <host> pleroma[25490]: 21:24:52.550 [error] #PID<0.30836.0> running Pleroma.Web.Endpoint (connection #PID<0.30835.0>, stream id 1) terminated
nov. 21 21:24:52 <host> pleroma[25490]: Server: <domain>:80 (http)
nov. 21 21:24:52 <host> pleroma[25490]: Request: GET /api/v1/timelines/home?with_muted=true&limit=20
nov. 21 21:24:52 <host> pleroma[25490]: ** (exit) an exception was raised:
nov. 21 21:24:52 <host> pleroma[25490]:     ** (DBConnection.ConnectionError) tcp recv: closed (the connection was closed by the pool, possibly due to a timeout or because the pool has been terminated)
nov. 21 21:24:52 <host> pleroma[25490]:         (ecto_sql 3.6.2) lib/ecto/adapters/sql.ex:760: Ecto.Adapters.SQL.raise_sql_call_error/1
nov. 21 21:24:52 <host> pleroma[25490]:         (ecto_sql 3.6.2) lib/ecto/adapters/sql.ex:693: Ecto.Adapters.SQL.execute/5
nov. 21 21:24:52 <host> pleroma[25490]:         (ecto 3.6.2) lib/ecto/repo/queryable.ex:224: Ecto.Repo.Queryable.execute/4
nov. 21 21:24:52 <host> pleroma[25490]:         (ecto 3.6.2) lib/ecto/repo/queryable.ex:19: Ecto.Repo.Queryable.all/3
nov. 21 21:24:52 <host> pleroma[25490]:         (pleroma 2.4.3) lib/pleroma/pagination.ex:40: Pleroma.Pagination.fetch_paginated/4
nov. 21 21:24:52 <host> pleroma[25490]:         (pleroma 2.4.3) lib/pleroma/web/activity_pub/activity_pub.ex:484: Pleroma.Web.ActivityPub.ActivityPub.fetch_activities/3
nov. 21 21:24:52 <host> pleroma[25490]:         (pleroma 2.4.3) lib/pleroma/web/mastodon_api/controllers/timeline_controller.ex:56: Pleroma.Web.MastodonAPI.TimelineController.home/2
nov. 21 21:24:52 <host> pleroma[25490]:         (pleroma 2.4.3) lib/pleroma/web/mastodon_api/controllers/timeline_controller.ex:5: Pleroma.Web.MastodonAPI.TimelineController.action/2
nov. 21 21:24:53 <host> pleroma[25490]: 21:24:53.490 request_id=FymzT7OKu_Gjex4AANER [error] Internal server error: %FunctionClauseError{args: nil, arity: 1, clauses: nil, function: :"-pin_data_from_featured_collection/1>
nov. 21 21:24:53 <host> pleroma[25490]: 21:24:53.490 request_id=FymzT7OKu_Gjex4AANER [info] Sent 500 in 1685ms
nov. 21 21:24:54 <host> pleroma[25490]: 21:24:53.496 [error] #PID<0.30905.0> running Pleroma.Web.Endpoint (connection #PID<0.30904.0>, stream id 1) terminated
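
Most of the 500s in this excerpt end in DBConnection timeouts, which matches point 2 of the earlier list (slow queries exhausting the connection pool). One way to look for those from the host, again assuming the database is named pleroma:

    # Show active queries and how long they have been running,
    # to spot slow statements that could be starving the pool
    sudo -u postgres psql -d pleroma -c \
      "SELECT pid, state, now() - query_start AS runtime, left(query, 80) AS query
         FROM pg_stat_activity
        WHERE datname = 'pleroma' AND state <> 'idle'
        ORDER BY runtime DESC;"
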
MCMic commented 1 year ago

Ok, I managed to solve most problems by writing and running the following script, which imports data only after using pleroma_ctl to create the database: https://paste.yunohost.org/gowerifaca.bash

I am not sure why it works, and I still have some 500 errors, but at least most features are working again: I can follow, read and write toots.

I took the opportunity to learn how to use pleroma_ctl and run ./bin/pleroma_ctl database prune_objects. Next backup should be smaller.

This should be added to the admin tasks section of the README.
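
For reference, a rough sketch of how those pleroma_ctl tasks can be run on a YunoHost install; the release path /var/www/pleroma/live and the pleroma system user are assumptions, not taken from the package, so adjust them to where the app is actually installed:

    # Run pleroma_ctl as the app's system user from the release directory
    cd /var/www/pleroma/live
    sudo -u pleroma ./bin/pleroma_ctl migrate
    sudo -u pleroma ./bin/pleroma_ctl database prune_objects
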

yalh76 commented 1 year ago

> Ok, I managed to solve most problems by writing and running the following script, which imports data only after using pleroma_ctl to create the database: https://paste.yunohost.org/gowerifaca.bash
>
> I am not sure why it works, and I still have some 500 errors, but at least most features are working again: I can follow, read and write toots.
>
> I took the opportunity to learn how to use pleroma_ctl and run ./bin/pleroma_ctl database prune_objects. Next backup should be smaller.
>
> This should be added to the admin tasks section of the README.

Feel free to do a PR to add that command line ;)

MCMic commented 1 year ago

I attempted to upgrade to 2.5.0 today; it failed again: https://paste.yunohost.org/raw/daheqiqoho

MCMic commented 1 year ago

This is getting weirder and weirder. Pleroma actually updated fine and is now on 2.5.0, but yunohost update now fails:

Info: Fetching available updates for system packages...
Warning: sh: 0: getcwd() failed: No such file or directory
Error: Some errors occurred while updating the APT cache (Debian package manager). Here is an excerpt of the sources.list lines that might help identify the problematic lines:
sources.list:deb http://ftp.be.debian.org/debian/ bullseye main
sources.list:deb-src http://ftp.be.debian.org/debian/ bullseye main
sources.list:deb http://security.debian.org/ bullseye-security main
sources.list:deb-src http://security.debian.org/ bullseye-security main
sources.list:deb http://ftp.be.debian.org/debian/ bullseye-updates main
sources.list:deb-src http://ftp.be.debian.org/debian/ bullseye-updates main
sources.list.d/extra_php_version.list:deb https://packages.sury.org/php/ bullseye main
sources.list.d/yunohost.list:deb http://forge.yunohost.org/debian/ bullseye stable
sources.list.d/docker.list:deb [arch=amd64] https://download.docker.com/linux/debian bullseye edge
Warning: Failed to parse this line : sh: 0: getcwd() failed: No such file or directory
Info: Updating the application catalog...
Success! The application catalog has been updated!
Error: Command '['git', 'ls-remote', '--symref', 'https://github.com/YunoHost-Apps/pleroma_ynh', 'HEAD']' returned non-zero exit status 128.
Error: Unable to fetch the source files, is the URL correct?

But apt update itself works fine.

MCMic commented 1 year ago

Oh, ok, that was because I was inside the pleroma folder, which was modified by the update. So now yunohost still thinks I have an out-of-date pleroma while it is actually already on 2.5.0 :-/
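
The getcwd() warnings come from the shell's working directory having been deleted and recreated during the upgrade; moving out of it before running anything else clears them, for example:

    # Leave the now-stale directory before running further commands
    cd ~

    # Refresh the catalogue and re-check what yunohost thinks the app's version is
    yunohost tools update
    yunohost app info pleroma
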

MCMic commented 1 year ago

So that was the problem this time: one must not run the pleroma upgrade from inside its folder.

It would be useful if this were detected and the upgrade stopped in that case.
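
Something like the following at the start of the upgrade script could catch it. This is only a sketch: $final_path stands for the app's install directory, ynh_die for the usual helper, and whether a /proc scan is the right place for such a check would need discussion:

    # Hypothetical guard: refuse to upgrade while any process (typically the admin's
    # interactive shell) still has its working directory inside the app directory,
    # since that directory is replaced during the upgrade and getcwd() then starts failing.
    for cwd_link in /proc/[0-9]*/cwd; do
        proc_cwd=$(readlink -f "$cwd_link" 2>/dev/null) || continue
        case "$proc_cwd" in
            "$final_path"|"$final_path"/*)
                ynh_die --message="A process is running from inside $final_path; cd out of it and retry the upgrade."
                ;;
        esac
    done
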