I used the following image: docker.io/matrixdotorg/dendrite-polylith@sha256:43816ad9050e9cb18b8e5c0cd41f5a779cec295fb122f164b4461f998ec5d344
Dendrite was split over 6 nodes.
Description
What
Databases started spewing errors about invalid byte sequences and missing relations (exact messages under "How").
Who
Entire homeserver.
How
Databases emitting invalid byte sequence for encoding "UTF8": 0x00 (and similar) errors, and components writing ERROR: relation "?" does not exist at character ?
When
I'm not 100% sure. Although I think it had already begun happening prior, it started becoming a major issue after I tried to set my avatar to an animated one (webp) and the server crashed (a user of the homeserver wrote to me "the server bricked after you changed your profile picture", and I experienced the same). After the components that crashed came back up again, I checked the logs and gradually the Postgres instances began throwing errors as described above. I'm not sure which exact components crashed, but considering we were seeing the "network lost" messages in our clients, probably the client api? (and more?)
Steps to reproduce
I am unsure how to reproduce this at the moment. I am filing this issue as I was requested to, so it does not currently have reproduction steps. I understand if the issue is closed; this is just for reference :)
I am, however, suspicious of the database locale. With the Postgres locale set to en_US, the errors occurred, but after changing it to en_GB (same as dendrite.matrix.org) the errors stopped. It is entirely possible that this is unrelated, but I don't have enough familiarity with the internals of Postgres or Dendrite to say for certain. :)
I first noticed everything with the keyserver database:
(apologies for it being images. I didn't copy the relevant lines to a text document before troubleshooting the issue, and no longer have access to the raw logs)
Right, this column cannot be TEXT as it doesn't allow null bytes. It needs to be BYTEA. Same applies for the event_json table and anywhere else where we dump user input.
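To illustrate the point about null bytes, here is a minimal Go sketch, not Dendrite's actual code. It shows how a JSON `\u0000` escape in user input becomes a real 0x00 byte once decoded, which is the byte the Postgres error above complains about for TEXT values. The `displayname` field, `decodeDisplayName` and `hasNUL` are hypothetical names chosen for the example.

```go
package main

import (
	"encoding/json"
	"fmt"
	"strings"
)

// hasNUL reports whether s contains a 0x00 byte. PostgreSQL does not
// accept this byte inside TEXT values.
func hasNUL(s string) bool {
	return strings.IndexByte(s, 0) >= 0
}

// decodeDisplayName is a hypothetical helper that pulls one user-supplied
// string out of a JSON request body.
func decodeDisplayName(body []byte) (string, error) {
	var payload struct {
		DisplayName string `json:"displayname"`
	}
	if err := json.Unmarshal(body, &payload); err != nil {
		return "", err
	}
	return payload.DisplayName, nil
}

func main() {
	// The raw body only carries the six-character escape sequence \u0000.
	body := []byte(`{"displayname":"ali\u0000ce"}`)

	name, err := decodeDisplayName(body)
	if err != nil {
		fmt.Println("decode error:", err)
		return
	}

	// After decoding, the escape has turned into an actual NUL byte.
	fmt.Println("raw body has NUL:    ", hasNUL(string(body)))
	fmt.Println("decoded name has NUL:", hasNUL(name))

	// Keeping the value as raw bytes avoids treating it as text at all;
	// a column of type BYTEA can hold every byte value, including 0x00.
	raw := []byte(name)
	fmt.Printf("bytes to store: %v\n", raw)
}
```

A check like `hasNUL` could be used to reject or log such input early, but as noted above the more robust fix is to store user-controlled payloads in a BYTEA column.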
Background information
go version: n/a
Link to where I ask in #dendrite:matrix.org for help: https://matrix.to/#/!RcWPWcZrMeBxOGaalX:matrix.org/$mWcIimXJ33UNl2spsYu_-tBqMKjoPXQ4ro3_op3rP7s?via=matrix.org&via=envs.net&via=kanp.ai