Closed: wpuckering closed this issue 1 year ago.
Where did you find that postgres version?
I've been using this container image for some of my self-hosted Postgres containers for a while now: https://hub.docker.com/layers/library/postgres/13.2-alpine/images/sha256-3335d0494b62ae52f0c18a1e4176a83991c9d3727fe67d8b1907b569de2f6175?context=explore
Admittedly I should get around to upgrading them all to a more recent version, but everything's been working fine for the 20 or so other services I host that use this image, so I wasn't in a rush. I didn't think the likelihood of hitting an incompatibility would be all that high, but maybe that's the case now?
Lemmy 0.18.2 runs fine with the database using that image. Does Lemmy 0.18.3 introduce some syntax for database migrations that's only valid on a higher version of Postgres?
That's not a postgres version we support. All our CI jobs would fail if we used older postgres versions. Use the backup and restore docs to do a postgres upgrade.
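For anyone following along with the same Docker setup, roughly what that backup-and-restore upgrade looks like (a hedged sketch only; the service name "postgres", the "lemmy" database user, and the volume handling are assumptions here, so follow the official backup and restore docs for the exact procedure):
# Dump everything from the running 13.x container (service and user names assumed; match your docker-compose.yml)
docker-compose exec postgres pg_dumpall -U lemmy > lemmy_dump.sql
# Point docker-compose.yml at a newer image (e.g. postgres:15-alpine) and a fresh data volume,
# then bring only the database back up and restore the dump into it
docker-compose up -d postgres
docker-compose exec -T postgres psql -U lemmy < lemmy_dump.sql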
Okay, I'll upgrade and see if it resolves the issue. Thanks!
That's not a postgres version we support. All our CI jobs would fail if we used older postgres versions.
The supported "From Scratch" install instructions yielded PostgreSQL 14 on my Ubuntu 22.04.2 system, and those "From Scratch" installs were recommending Ubuntu 20.04 back at the start of June (not sure what version is default in the distro). There is no step in those instructions to use a specific version of PostgreSQL...
Further follow-up:
"Ubuntu 20.04 comes with Postgres 12 from it's universe repository." source: https://pgdash.io/blog/postgres-13-getting-started.html
On September 14, 2021, "From Scratch" instructions were committed to the Lemmy project recommending Ubuntu 20.04, cite: https://github.com/LemmyNet/lemmy-docs/commit/9ca86abb39fc4ea15b062faa3daf246ec7965ba4
If one tests and follows these directions as they were published, it is my understanding you would be on PostgreSQL version 12.
I believe it is important to keep this issue open. "Lemmy From Scratch" installations could be impacted by these upgrade issues.
We can't support from-scratch installs, just docker and ansible installs, and neither of those use the postgres version above.
I added an EMERGENCY pull request about documenting that you do not support from-scratch installs and do not warn people of the upgrade failures: https://github.com/LemmyNet/lemmy/issues/3756
We already have that warning: https://join-lemmy.org/docs/administration/administration.html
We already have that warning
Can you be precise, exact quote or GitHub line-link to precise line of content, exact warning that upgrades will crash in 0.18.3 that I added?
Just chiming in, I took a backup of my Postgres 13.2 database, and restored it into Postgres 15.3. It's still having issues running the migrations, it fails at the same point but yields a more accurate error:
thread 'main' panicked at 'Couldn't run DB Migrations: Failed to run 2023-07-08-101154_fix_soft_delete_aggregates with: relation "comment" does not exist', crates/db_schema/src/utils.rs:221:25
Might possibly have something to do with this? https://github.com/LemmyNet/lemmy/issues/3501
Looks like somehow my comment table is missing. I didn't do anything on my part which should have caused it to be deleted, but that appears to be the root cause.
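(If anyone else hits the "relation comment does not exist" variant, a quick sanity check after restoring is to connect with psql and look for the table; the user and database names below are placeholders for your own setup:)
psql -U lemmy -d lemmy -c '\dt comment*'
psql -U lemmy -d lemmy -c 'SELECT count(*) FROM comment;'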
I'm back in business. Something went very wrong with the migrations in 0.18.3, but I wasn't able to pinpoint what it was (I didn't try too hard to be honest, I just wanted to get up and running again).
Here's what I did that put me back into a good state in the end:
I don't know if that will help anyone else or not. Something seems janky with the migration scripts, maybe only if you had been running with an older version of Postgres (like 13) since before 0.18.3. Just glad I got up to the new version. Will definitely exercise more caution when upgrading from here on.
Were you by any chance running an rc, or main branch? I can't imagine why the comment table would go missing like that.
Never ran any release candidates, always stuck to the stable container images.
Can you be precise, exact quote or GitHub line-link to precise line of content, exact warning that upgrades will crash in 0.18.3 that I added?
Thank you for responding. Has Lemmy.ml for the past 60 days been an example of a "supported" system? The constant crashes caused by issue #2910 have confused me every single day since June 4. And why are you doing everything you can to avoid people opening PostgreSQL-related topics?
I am BEYOND bewildered about what a supported system looks like. I know I have brain damage, which is not a joke, but I am having to develop advanced social theories to explain why issue #2910 was so ignored when it could be fixed with a simple SQL removal of the TRIGGER? PostgreSQL is the core engine of lemmy, it holds everything, I just don't understand why the rush to close issues immediately after a release instead of doing due diligence to be sure nobody else has the problem (which they won't likely find on a CLOSED support issue).
"The buck stops before PostgreSQL", support doesn't include #2910?
If this is all some elaborate social hazing, I think you guys are masters of it! #2910 since June 4! I personally created a community on Lemmy June 13 called "!lemmypperformance@lemmy.ml" in response to it. If it is all social hazing, you sure got me until about July 15.
Is lemmy.ca a "supported system"? As they personally cloned their database and ran PostgreSQL EXPLAINS to find their own crashes on Saturday and Sunday. I had a pull request waiting for you Monday morning! "Ohh PostgreSQL is icky, close any reports and avoid touching #2910" seems to be the very problem - with constant crashes and overloads. I am beyond bewildered and again, having to develop sophisticated social theories to explain why you ignored #2910 when it is so clearly written and spelled out as a server-crash issue. "Supported"? I can't believe how much you RUSHED to close #3756 and avoid anyone seeing a PostgreSQL topic. Bewildered, dumbfounded, advanced social theories is all I have left! Hazing of some extreme order! June 4, JUNE FOUR, #2910 SUPPORTED!
Explain your social behavior with issue #2910 - not supported?
Is lemmy.ml your server, or not? Do you not have the ability to fix GitHub issue #2910 with the simple REMOVAL of a TRIGGER? June 4.... 11 days later, on June 15 I am submitting lemmy.ml support tickets about the crashes related to #2910 PostgreSQL crashes:
Go ahead, RED CIRCLE what supported means in Lemmy project. Is lemmy.ml a SUPPORTED SERVER running DOCKER? Is issue #2910 going to be ignored the rest of your life, June 4 2023?
More from June, over a MONTH ago, about your personal labor to AVOID making PostgreSQL the central focus of Lemmy development TO FIX CRASHES of LEMMY.ML and OTHER SERVERS and support: https://github.com/LemmyNet/lemmy-docs/issues/233
Why in GOD's NAME, did you close this issue #3756 as fast as you possibly could when you never tested Lemmy 0.18.3 on "From Scratch" installs? WHY? NOT RHETORICAL!
You make a RELEASE on Friday, when you obviously do not work on Saturday and Sunday, ignoring just this recent Sunday pull-request with an EASY and FAST fix to server-crashing crisis of the entire Lemmy network of servers... (shall I bring up how you edited out the word "Emergency" first thing Monday when you started working?)
Was June 4, not June 1, July, Not June 3 but June 4 long enough to leave this issue ignored because it was PostgreSQL and not your Rust coding focus? Was #2910 not CLEAR ENOUGH in ENGLISH:
AGAIN: ARE YOU TROLLING THE ENTIRE REDDIT COMMUNITY WITH lemmy.ml CRASHING every day since May 25 that I have documented? And how RAPIDLY and "Johnny ON THE SPOT" you were closing out PostgreSQL related upgrade reports on fresh RELEASE of 0.18.3 - issue #3756?
Am I being trolled with the illusion that "PostgreSQL isn't supported, https://lemmy.ml/ crashing every single hour of every single day since May 25, 2023 because of PostgreSQL overload in lemmy_server logic isn't what we support! Again, again: is lemmy.ml an EXAMPLE of your "supported" servers? ignoring #2910 since June 4?
I am having to research and order additional learning material (books) about social hazing practices. It is all far beyond my personal experience and understanding what you think you are doing closing this PostgreSQL issue #3756 as rapidly as possible while avoiding the truth of PostgreSQL issue #2910.
I personally suffer from extensive brain damage, and I am in the final time of my life. I have been hospitalized in mental institutions several times in recent years with brain damage problems.
So I have learned a lot about psychological behavior changes. And I know PostgreSQL seems to trigger total avoidance in the Lemmy project leadership. So I am spending my weekend reviewing, once again, how you could close issue #3756 so rapidly, while ignoring issue #2910 while your official project lemmy.ml server has been crashing every-hour of every-day since May 25 that I have personally documented it.
Let's use Lemmy community: https://lemmy.ml/post/2561833
I KNOW from over 35 years of production application development with SQL that issue #2910 was fixable in 30 minutes of labor: remove the TRIGGER with an SQL statement. So I can only get into advanced social theories of why, since June 4, you have been behaving this way about PostgreSQL topics.
Do you all wish death on brain-damaged humans?
Ok, I'm sorry about my reaction to this, I have to face up that the Lemmy community has values that I overlooked.
I’ve had the wrong attitude about the server needing to be reliable and about the importance of data loss. It was bewildering to me, beyond my imagination, that social things worked like this in the modern day. But I need to face up to the fact that Elon Musk does it, and I don't get as bewildered about that. Social things can be this way, and I have too much brain damage, and too much reliving of personal memories and experience with PostgreSQL, MaxDB and DB/2, that trained me to treat data differently.
I really failed to be open-minded about the community thinking the crashes are fine. I'm sorry; I guess it's baggage from past jobs and server responsibilities.
When I step back and study the history of Microsoft vs. Apple, the “open hardware” approach of Microsoft proved to be abandoned. Apple always had the best loyalty by NOT offering hardware choice to their customers. And Linux, which offered the most hardware choices of all, was even less popular. Apple knew people didn’t like to pay money for software, so they were an operating system company that packaged it with hardware.
If you look at how Hollywood productions are run in terms of expenses for sets, clothes, shooting locations - it can be millions of dollars for 90 seconds of a movie. Because it’s in the sphere of social media, films are social.
It was hard for me to come to terms with. I drank for the first time since New Year's, I was in so much mental pain and anguish, which has more to do with deteriorating hardware than anything. I tried to get words out to convey my PostgreSQL experiences running with data in production, and I just didn't see that with social sites the rules are social and not about data integrity.
I should have stuck to topics like ChatGPT, Neil Postman, Marshall McLuhan. Sorry I'm such a mess and had all the wrong attitude about a project like this.
This is why, Sunday night
I compiled from source in order to upgrade from 0.18.2 to 0.18.4. I am following the "Install from scratch" guide (I know it is not the official method of installing Lemmy; however, I am not familiar with Docker, and I like running programs directly on my Debian 12 system).
After compilation, I tried restarting the lemmy systemd service, however it fails to run. In the journalctl, I see the following error message:
lemmy_server[17631]: thread 'main' panicked at 'Couldn't run DB Migrations: Failed to run 2023-07-08-101154_fix_soft_delete_aggregates with: syntax error at or near "trigger"', crates/db_schema/src/utils.rs:221:25
Is there a solution to that? I have lost my lemmy instance because of that.
Do you have a backup from 0.18.2, before the upgrade? PostgreSQL restore is amazingly fast; it does all the indexes and constraints at the end. What is your version of PostgreSQL, 15.4?
Here is the source file that it is referencing: https://github.com/LemmyNet/lemmy/blob/main/migrations/2023-07-08-101154_fix_soft_delete_aggregates/up.sql
I have a backup from before the upgrade. However, on my server I have PostgreSQL 13 and 15. I had the server running Debian 11, and then I dist-upgraded to Debian 12, and along with it came PostgreSQL 15.
However, the psql db that Lemmy is running on is still 13 (up until Lemmy v0.18.2).
To make things trickier, I have been running a Matrix Dendrite server on the same VPS, on the psql 13 database (under a different database username, of course).
So, how would a database restore work in this case? I would like the Dendrite database to be left unscathed and to only migrate the Lemmy db data from the psql 13 database to the psql 15 database.
Yeah, the issue you are responding to is, I think, validation that older PostgreSQL stopped working in the recent version... you have to be on 15.
PostgreSQL pretty much has you dump the entire database to a text file and re-import it; the backup/restore process is very fast, even if you have 400,000 posts and an equal number of comments. From what I've seen, you should be able to back up from 13, restore to 15, change the parameters to point to the new server, start lemmy_server, and it should get past the error you are having.
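A rough sketch of that dump-and-restore for the two-cluster Debian case, dumping only the lemmy database so the dendrite one is untouched (the port numbers, role name, and database name are assumptions; check pg_lsclusters for your actual ports):
sudo -u postgres pg_dump -p 5432 -Fc lemmy > /tmp/lemmy.dump    # dump only the lemmy db from the old 13 cluster
sudo -u postgres createuser -p 5433 lemmy                       # recreate the role on the 15 cluster
sudo -u postgres createdb -p 5433 -O lemmy lemmy                # empty database owned by that role
sudo -u postgres pg_restore -p 5433 -d lemmy /tmp/lemmy.dump    # restore into the 15 cluster
# You will likely also need to set the role's password on the 15 cluster to match lemmy.hjson.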
@RocketDerp are you in the Lemmy Matrix support rooms? If so, can you DM me, or ping me on one of the Lemmy Upgrade or Lemmy Install matrix rooms? My matrix username is @k4r4b3y:karapara.net
I would like to learn more about how I can do the lemmy database migration from psql 13 to 15. I have never used psql (or any other SQL software before).
Also, AFAIK Lemmy is by default using psql 15 on my system; however, the Debian psql 15 has inherited the previously existing psql 13 database and is running it. So the psql 15 software is running the psql 13 db on my Debian 12. How do I fix these things?
are you in the Lemmy Matrix support rooms?
I'll be honest, I've been involved in development since the 1980s, and I've always preferred search-engine-facing content like Reddit, as opposed to black holes where nothing gets into search engines, be it IRC chat or Discord. I know there is a long tradition of open source with IRC, but every time I ever visited those rooms on obscure projects, nobody was around to answer. Not to say people always answer a Reddit or BBS posting.
I would like to learn more about how I can do the lemmy database migration from psql 13 to 15. I have never used psql (or any other SQL software before).
I did create a Lemmy community for this topic: https://lemmy.ml/c/lemmy_postgresql@bulletintree.com
I can try to make time to help you today... but Matrix Chat is just another platform to eat up time when I'm struggling to keep up with all the learning myself.
AFAIK Lemmy is by default using psql 15 on my system; however, the Debian psql 15 has inherited the previously existing psql 13 database and is running it. So the psql 15 software is running the psql 13 db on my Debian 12. How do I fix these things?
I assume these are running on different ports, tcp ports, I'd start there and talk specific port numbers.
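On Debian/Ubuntu the quickest way to see which cluster owns which port is the postgresql-common tooling, with a socket listing as a cross-check (just a sketch of where to look, nothing Lemmy-specific):
pg_lsclusters                     # lists each cluster with its version, port, status and data directory
sudo ss -tlnp | grep postgres     # confirms which postgres processes are actually listening on 5432/5433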
@RocketDerp thank you for the detailed answer. I will create an account on the lemmy server you posted and detail my issue on the sub-lemmy you posted there. Is that ok?
I'd start there and talk specific port numbers.
I looked into those yesterday with phind.com AI assistant. And I realized that the ports 5432 and 5433 are being used. Afaik, 5432 belongs to the psql 13 and 5433 belongs to the psql 15.
I will create an account on the lemmy server you posted and detail my issue on the sub-lemmy you posted there. Is that ok?
Fine with me, or you can pick any other active instance and subscribe to it.
Afaik, 5432 belongs to the psql 13 and 5433 belongs to the psql 15.
I'll look for your post in Lemmy. I'm RoundSparrow on Lemmy.ml - and I'll discuss that further
Thanks. I should have posted a thread on there in an hour.
@RocketDerp , I posted here: https://bulletintree.com/post/3047685
ok, i'm going there now
I have the same issue. I'm using an EXTERNAL Postgres database, and for similar reasons to others, the schema update failed. For me, however, restoring from backup after upgrading to PostgreSQL 15 (which deletes the old db) didn't work, and I still have the same error.
Now I can manually fix the up.sql/down.sql (and have tested this), but docker-compose up still fails, presumably because the up/down scripts in the docker image are still "not fixed".
I did try docker-compose run lemmy /bin/sh to try to poke around inside the container, but I'm quite sure that's not the right way to do this. I know enough about docker to be dangerous, but that's it.
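One possible way around that with an external database, sketched here as an assumption rather than a verified fix: apply your corrected migration SQL directly with psql, then look at Diesel's bookkeeping table to see what the migration runner thinks has already run. The connection string is a placeholder, and the __diesel_schema_migrations table name is stock Diesel behaviour that I have not verified against this Lemmy image.
# Apply the hand-fixed up.sql straight to the external database
psql "postgresql://lemmy:PASSWORD@db.example.com:5432/lemmy" -f fixed_up.sql
# Inspect which migrations Diesel believes have been applied
psql "postgresql://lemmy:PASSWORD@db.example.com:5432/lemmy" -c "SELECT version, run_on FROM __diesel_schema_migrations ORDER BY run_on DESC LIMIT 5;"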
For anyone who lands here trying to run a fresh "install from scratch", here's what worked for me.
user@lemmy:~/lemmy$ pg_lsclusters
Ver Cluster Port Status Owner Data directory Log file
16 main 5432 online postgres /var/lib/postgresql/16/main /var/log/postgresql/postgresql-16-main.log
pg_dropcluster --stop 12 main
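For anyone on Debian/Ubuntu who would rather carry the old cluster forward than start fresh, the same postgresql-common tooling can do the juggling (the cluster numbers below are assumptions; adjust to whatever pg_lsclusters reports):
pg_lsclusters                        # see which clusters exist and which ports they listen on
sudo pg_upgradecluster 12 main       # copy the old cluster's data into a cluster on the new version
sudo pg_dropcluster --stop 12 main   # once the upgraded cluster is confirmed working, drop the old one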
Requirements
Summary
When upgrading Lemmy BE from the 0.18.2 container image to the 0.18.3 container image, database migrations fail with this error:
thread 'main' panicked at 'Couldn't run DB Migrations: Failed to run 2023-07-08-101154_fix_soft_delete_aggregates with: syntax error at or near "trigger"', crates/db_schema/src/utils.rs:221:25
Reverted back to the 0.18.2 tag and my instance is running normally again without issue.
For the Postgres database I'm using postgres:13.2-alpine.
Steps to Reproduce
Technical Details
thread 'main' panicked at 'Couldn't run DB Migrations: Failed to run 2023-07-08-101154_fix_soft_delete_aggregates with: syntax error at or near "trigger"', crates/db_schema/src/utils.rs:221:25 note: run with 'RUST_BACKTRACE=1' environment variable to display a backtrace
Version
BE 0.18.3
Lemmy Instance URL
No response