Frequent [TXN OPERROR] messages

c7hm4r commented 5 years ago

These lines appear in our log files multiple times (around 5 per day). Our Synapse instance is very small. So far I have not noticed any problems with our Synapse server that seem to be related to them. However, the log level is WARN, so do I need to worry?

16:22:32 Starting db txn 'update_presence' from sentinel context
16:22:32 Starting db connection from sentinel context: metrics will be lost
16:23:27 [TXN OPERROR] {claim_e2e_one_time_keys-19645} could not serialize access due to concurrent update
16:23:27  0/5
16:23:27 [TXN OPERROR] {claim_e2e_one_time_keys-19645} could not serialize access due to concurrent update
16:23:27  1/5
16:23:27 [TXN OPERROR] {claim_e2e_one_time_keys-19648} could not serialize access due to concurrent update
16:23:27  0/5
16:23:27 [TXN OPERROR] {add_messages_to_device_inbox-19655} could not serialize access due to concurrent update
16:23:27  0/5
16:23:27 [TXN OPERROR] {add_messages_to_device_inbox-19654} could not serialize access due to concurrent update
16:23:27  0/5
16:23:27 [TXN OPERROR] {add_messages_to_device_inbox-19652} could not serialize access due to concurrent update
16:23:27  0/5
16:23:27 [TXN OPERROR] {add_messages_to_device_inbox-19655} could not serialize access due to concurrent update
16:23:27  1/5
16:23:32 Starting db txn 'update_presence' from sentinel context
16:23:32 Starting db connection from sentinel context: metrics will be lost

OS: Ubuntu 18.04
Version of package matrix-synapse-py3: 0.99.3+bionic1
DB: PostgreSQL, which is also accessed by MXISD

richvdh commented 5 years ago

They aren't a problem, though I agree it's unhelpful that they happen so often.
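The `0/5` and `1/5` lines in the log above read like an attempt counter, which would fit a transaction that is retried after a serialization failure instead of being abandoned. Below is a minimal sketch of such a retry loop. It is not Synapse's actual code: `SerializationFailure`, `run_with_retries`, `make_flaky_txn` and the limit of 5 are assumptions made for illustration, and the fake transaction only simulates the database error.

```python
import logging

logger = logging.getLogger("txn_retry_sketch")

MAX_ATTEMPTS = 5  # assumption: chosen to mirror the "/5" seen in the log lines


class SerializationFailure(Exception):
    """Hypothetical stand-in for the database driver's serialization error."""


def run_with_retries(name, func, max_attempts=MAX_ATTEMPTS):
    """Call func() until it succeeds or max_attempts attempts have failed."""
    for attempt in range(max_attempts):
        try:
            return func()
        except SerializationFailure as e:
            # Log at WARNING level, then retry unless this was the last attempt.
            logger.warning(
                "[TXN OPERROR] {%s} %s %d/%d", name, e, attempt, max_attempts
            )
            if attempt == max_attempts - 1:
                raise


def make_flaky_txn(failures):
    """Return a fake transaction that fails `failures` times, then succeeds."""
    state = {"calls": 0}

    def txn():
        state["calls"] += 1
        if state["calls"] <= failures:
            raise SerializationFailure(
                "could not serialize access due to concurrent update"
            )
        return state["calls"]  # number of the call that finally succeeded

    return txn


if __name__ == "__main__":
    logging.basicConfig(level=logging.WARNING)
    result = run_with_retries("claim_e2e_one_time_keys", make_flaky_txn(2))
    print("succeeded on call", result)
```

In this sketch a warning is logged for every failed attempt, but the caller only sees an error if every attempt fails. That would be consistent with the messages being noisy but harmless.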

airblag commented 4 years ago

I have the same issue. This message appears a lot while trying to join a big federated room like Matrix HQ (sorry, my locale is German, but it translates to "could not serialize access due to concurrent update"):

2020-03-24 18:30:14,520 - synapse.storage.database - 418 - WARNING - user_directory.notify_new_event-4457- [TXN OPERROR] {update_profile_in_user_dir-b23e} FEHLER:  kann Zugriff nicht serialisieren wegen gleichzeitiger Aktualisierung

In the PostgreSQL logs, I can see a lot of these at the same time (again, sorry for the German locale):

2020-03-24 18:31:20.841 CET [14420] synapse_user@synapse ANWEISUNG:  INSERT INTO user_directory (user_id, display_name, avatar_url) VALUES ('@username:XxXxXx.com', 'User', 'mxc://XxXxXx.com/XxXxXxXxXxXxXxXxXxXxXxXxJU') ON CONFLICT (user_id) DO UPDATE SET display_name=EXCLUDED.display_name, avatar_url=EXCLUDED.avatar_url
2020-03-24 18:31:20.851 CET [14425] synapse_user@synapse FEHLER:  kann Zugriff nicht serialisieren wegen gleichzeitiger Aktualisierung

I got something like 150 of these while joining Matrix HQ just now, which makes the join take a few minutes. From what I found on Google, it might come from the way the SQL requests are made, but I'm a newbie with PostgreSQL: https://www.postgresql.org/docs/10/transaction-iso.html

There might be some SQL optimization to do somewhere around there, or is there some PostgreSQL tuning that can resolve these parallel updates?

evilham commented 4 years ago

Good to read that these messages aren't an issue :-). I was a bit alarmed by them before :-D.

erikjohnston commented 4 years ago

The way forward is probably to look at the transactions where we are seeing these errors and update them so that they use autocommit mode, like we do in https://github.com/matrix-org/synapse/pull/8542.
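One way to find out which transactions are affected most often is to tally the `[TXN OPERROR]` lines by transaction name. The following is a rough sketch that assumes the log format quoted earlier in this thread (`{name-id}` after the marker); `count_operrors` and the sample lines are illustrative, not part of Synapse.

```python
import re
from collections import Counter

# Captures the transaction name from "{name-id}", dropping the trailing "-id".
# Assumption: the id is a run of hex digits, as in the lines quoted in this thread.
TXN_RE = re.compile(r"\[TXN OPERROR\] \{(?P<name>[^}]+)-[0-9a-f]+\}")


def count_operrors(lines):
    """Count [TXN OPERROR] log lines per transaction name."""
    counts = Counter()
    for line in lines:
        match = TXN_RE.search(line)
        if match:
            counts[match.group("name")] += 1
    return counts


SAMPLE = [
    "16:23:27 [TXN OPERROR] {claim_e2e_one_time_keys-19645} could not serialize access due to concurrent update",
    "16:23:27  0/5",
    "16:23:27 [TXN OPERROR] {add_messages_to_device_inbox-19655} could not serialize access due to concurrent update",
    "16:23:27 [TXN OPERROR] {add_messages_to_device_inbox-19654} could not serialize access due to concurrent update",
    "2020-03-24 18:30:14,520 - synapse.storage.database - 418 - WARNING - user_directory.notify_new_event-4457- "
    "[TXN OPERROR] {update_profile_in_user_dir-b23e} FEHLER:  kann Zugriff nicht serialisieren wegen gleichzeitiger Aktualisierung",
]

if __name__ == "__main__":
    for name, count in count_operrors(SAMPLE).most_common():
        print(count, name)
```

Each retry of the same transaction is counted separately here, so a high count can come from a few transactions that conflict repeatedly.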

A potential alternative is to convert some of the transactions to use READ COMMITTED isolation level (rather than REPEATABLE READ), but there is currently no support for that in the code base.
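To illustrate why the isolation level matters here, below is a toy in-memory model of two transactions racing to update the same row. It is a deliberately simplified sketch, not PostgreSQL or Synapse code: `Row`, `Txn` and `race` are hypothetical names, updates commit immediately, and the snapshot is taken when the transaction object is created. The only point it makes is that under REPEATABLE READ an update fails if the row was changed after the transaction's snapshot, while under READ COMMITTED the update is applied to the latest committed version.

```python
class SerializationFailure(Exception):
    """Hypothetical stand-in for the database's serialization error."""


class Row:
    def __init__(self, value):
        self.value = value
        self.version = 0  # bumped on every committed update


class Txn:
    def __init__(self, row, isolation):
        self.row = row
        self.isolation = isolation
        # Simplification: remember the row version seen at transaction start.
        self.snapshot_version = row.version

    def update(self, new_value):
        changed_since_snapshot = self.row.version != self.snapshot_version
        if changed_since_snapshot and self.isolation == "REPEATABLE READ":
            # Someone else committed an update after our snapshot was taken.
            raise SerializationFailure(
                "could not serialize access due to concurrent update"
            )
        # READ COMMITTED (or no conflict): apply to the latest committed version.
        self.row.value = new_value
        self.row.version += 1


def race(isolation):
    """Start two transactions, let the second commit first, then update in the first."""
    row = Row("initial")
    first = Txn(row, isolation)
    second = Txn(row, isolation)
    second.update("from second")  # commits before `first` writes
    try:
        first.update("from first")
    except SerializationFailure:
        return "serialization failure"
    return "ok"


if __name__ == "__main__":
    for level in ("REPEATABLE READ", "READ COMMITTED"):
        print(level, "->", race(level))
```

The trade-off is that READ COMMITTED gives up the stable snapshot, so it would presumably only suit transactions that do not depend on seeing a consistent view across several statements.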

bradjones1 commented 1 year ago

Ran into this today while reviewing warning messages on my Synapse instance. In my case they are infrequent, but they show up on things like read receipts.

matrix-org / synapse

Frequent [TXN OPERROR] messages #4993