spicywebau / craft-neo

A Matrix-like field type for Craft CMS that uses existing fields
Other
402 stars 63 forks source link

Database Rollback failed on Deadlock #637

Closed maxstrebel closed 1 year ago

maxstrebel commented 2 years ago

Description

We experience a lot of problems with our page builder right now. With backend users reporting scrambled blocks and lost blocks. You might want to have a closer look. But in general, since our Upgrade to Craft 4 the blocks will scramble (show up at other places on the page) or vanish.

Right now we remove "Typed Link Field" from our content builder, in the hope we can get rid of some of the problems. When we tested, we found a Deadlock (SQLSTATE[40001]: Serialization failure: 1213 Deadlock found when trying to get lock;), that luckily was investigated by our webhost. I will quote quickly:

It looks as though there was a request processing which included an insert into lenz_linkfield. At the same time another process made a change to structureelements which resulted in the rollback. ​ I don't have too much experience of how NEO organises its database updates internally. But if it doesn't wrap its own DB updates in with Craft's own DB transactions during a Save I could see a possible scenario where Neo updates all of its models based on some edited Craft Entries, but then Craft rolls back its transaction (due to a deadlock in this case) ending up with the Neo blocks orphaned.

For me this looks like a possible explanation for what we are experiencing. Can you confirm, that this might be an issue?

Here is the deadlock:

------------------------
LATEST DETECTED DEADLOCK
------------------------
2022-08-11 11:31:05 0x7f570e784700
*** (1) TRANSACTION:
TRANSACTION 22379582, ACTIVE 2 sec inserting
mysql tables in use 1, locked 1
LOCK WAIT 81 lock struct(s), heap size 8312, 333 row lock(s), undo log entries 227
MariaDB thread id XXX, OS thread handle XXX, query id XXX XXX XXX_staging Update
INSERT INTO `lenz_linkfield` (`elementId`, `fieldId`, `siteId`, `linkedId`, `linkedSiteId`, `linkedTitle`, `linkedUrl`, `payload`, `type`, `dateCreated`, `dateUpdated`, `uid`) VALUES (XXX, XXX, 6, XXX, NULL, 'let\'s test it', 'https://XXX', '{\"customText\":\"Let\'s test it!\"}', 'entry', '2022-08-11 11:31:05', '2022-08-11 11:31:05', 'XXX')
*** WAITING FOR THIS LOCK TO BE GRANTED:
RECORD LOCKS space id 8292 page no 381 n bits 992 index XXX of table `XXX_staging`.`lenz_linkfield` trx id 22379582 lock_mode X insert intention waiting
Record lock, heap no 1 PHYSICAL RECORD: n_fields 1; compact format; info bits 0
 0: len 8; hex XXX; asc supremum;;

*** CONFLICTING WITH:
RECORD LOCKS space id 8292 page no 381 n bits 992 index XXX of table `XXX_staging`.`lenz_linkfield` trx id 22379582 lock_mode X
Record lock, heap no 1 PHYSICAL RECORD: n_fields 1; compact format; info bits 0
 0: len 8; hex XXX; asc supremum;;

RECORD LOCKS space id 8292 page no 381 n bits 992 index XXX of table `XXX_staging`.`lenz_linkfield` trx id 22379585 lock_mode X
Record lock, heap no 1 PHYSICAL RECORD: n_fields 1; compact format; info bits 0
 0: len 8; hex XXX; asc supremum;;

*** (2) TRANSACTION:
TRANSACTION XXX, ACTIVE 2 sec updating or deleting
mysql tables in use 1, locked 1
LOCK WAIT 42 lock struct(s), heap size 3488, 34 row lock(s), undo log entries 32
MariaDB thread id XXX, OS thread handle XXX, query id XX XXX XXX_staging Updating
UPDATE `structureelements` SET `root`=1385814, `dateUpdated`='2022-08-11 11:31:04' WHERE `id`=1385814
*** WAITING FOR THIS LOCK TO BE GRANTED:
RECORD LOCKS space id 8344 page no 2238 n bits 152 index structureelements_root_idx of table `XXX_staging`.`structureelements` trx id XXX lock_mode X insert intention waiting
Record lock, heap no 1 PHYSICAL RECORD: n_fields 1; compact format; info bits 0
 0: len 8; hex XXX; asc supremum;;

*** CONFLICTING WITH:
RECORD LOCKS space id 8344 page no 2238 n bits 136 index structureelements_root_idx of table `XXX_staging`.`structureelements` trx id XXX lock_mode X
Record lock, heap no 1 PHYSICAL RECORD: n_fields 1; compact format; info bits 0
 0: len 8; hex XXX; asc supremum;;

Record lock, heap no 44 PHYSICAL RECORD: n_fields 2; compact format; info bits 0
 0: len 4; hex XXX; asc   %I;;
 1: len 4; hex XXX; asc   %I;;

*** WE ROLL BACK TRANSACTION (0)

Any help would be hugely appreciated. Thank you for your time.

Steps to reproduce

It is hard to reproduce. We've had authors working in the backend for testing. After 4 hours of testing the authors provoked 4 deadlocks. But there was no indication why. There were entries with a lot of blocks, but there was also one entry affected with only one redactor and one link block.

  1. Work in the backend
  2. Have a lot of queries while working
  3. Experience Deadlock
  4. Whole content or partial content on page is wiped/scrambled

Other information

ttempleton commented 2 years ago

If a change to structureelements is involved, then that does sound like it could be happening when Neo builds (or deletes) a block structure, but that is using transactions (e.g. https://github.com/spicywebau/craft-neo/blob/3.3.0/src/services/Blocks.php#L264).

I'll have a look and see if there's anywhere we're missing the use of transactions - but does this still occur after you upgrade to Neo 3.3.0?

maxstrebel commented 2 years ago

Thank you for the quick fix on 3.3.0 and thanks for looking into it. I will keep you posted.

Unfortunately I'm very sure that 3.3.0 and the "scrambling" problem are unrelated. Our blocks would also show up on parents they are not allowed to be inside. So I was hoping to find a reasonable solution – as I cannot reproduce it, but it occurs to our authors so often that we have to lock our backend to all 35 users as also to remove Typed Link Field, as it was involved in the lock.

maxstrebel commented 2 years ago

For some context, the problem might also be, that there was no savepoint to rollback to? This is from the logs:

2022-08-11@13:31:05
php-staging
[+]
[warning] craft\db\mysql\Schema::rollBackSavepoint: Tried to roll back a savepoint, but it does not exist: SQLSTATE[42000]: Syntax error or access violation: 1305 SAVEPOINT LEVEL4 does not exist
2022-08-11@13:31:05
php-staging
[+]
[warning] craft\db\mysql\Schema::rollBackSavepoint: Tried to roll back a savepoint, but it does not exist: SQLSTATE[42000]: Syntax error or access violation: 1305 SAVEPOINT LEVEL3 does not exist
2022-08-11@13:31:05
php-staging
[+]
[warning] craft\db\mysql\Schema::rollBackSavepoint: Tried to roll back a savepoint, but it does not exist: SQLSTATE[42000]: Syntax error or access violation: 1305 SAVEPOINT LEVEL2 does not exist
2022-08-11@13:31:05
php-staging
[+]
[warning] craft\db\mysql\Schema::rollBackSavepoint: Tried to roll back a savepoint, but it does not exist: SQLSTATE[42000]: Syntax error or access violation: 1305 SAVEPOINT LEVEL1 does not exist
2022-08-11@13:31:05
php-staging
[+]
[error] yii\db\Exception: SQLSTATE[40001]: Serialization failure: 1213 Deadlock found when trying to get lock; try restarting transaction
maxstrebel commented 2 years ago

Hi Thomas, sorry I have to reopen this. We still experience a myriad of problems. I do not know if NEO is the alone cause of this. But as Neo behaves strangely (bocks deleting content, etc.) I place it here. At the moment nobody can work in the backend without breaking the content. I hope you can help us get to a stable state again:

Here's two deadlocks from this morning. We moved from a load balanced environment to a simple server. But we still get therese deadlocks.

2022-08-19 08:27:59 [web.ERROR] [yii\db\Exception] SQLSTATE[40001]: Serialization failure: 1213 Deadlock found when trying to get lock; try restarting transaction The SQL being executed was: UPDATE `structureelements` SET `root`=1428914, `dateUpdated`=‘2022-08-19 08:27:58’ WHERE `id`=1428914

...

2022-08-19 08:32:46 [web.ERROR] [yii\db\Exception] SQLSTATE[40001]: Serialization failure: 1213 Deadlock found when trying to get lock; try restarting transaction The SQL being executed was: UPDATE `structureelements` SET `root`=1429665, `dateUpdated`=‘2022-08-19 08:32:45’ WHERE `id`=1429665

On another note here's the additional problems we experience right now:

Currently we're in the process of trying to replicate it in a vanilla Craft install. Thats our current setup (we already removed Typed Link Field and LJ dynamic fields):

"born05/craft-assetusage": "3.0.0", "clubstudioltd/craft-asset-rev": "7.0.0", "craftcms/cms": "4.2.1.1", "craftcms/element-api": "3.0.1.1", "craftcms/feed-me": "5.0.4", "craftcms/redactor": "3.0.2", "mmikkel/retcon": "2.4.2", "nystudio107/craft-retour": "4.0.2", "nystudio107/craft-seomatic": "4.0.7", "putyourlightson/craft-blitz": "4.2.1", "spicyweb/craft-neo": "3.3.3", "verbb/formie": "2.0.6", "verbb/super-table": "3.0.0", "vlucas/phpdotenv": "^3.4.0", "wrav/oembed": "2.2.1"

maxstrebel commented 2 years ago

I can provoke the "A provisional draft already exists for element/user 2206/1." error in a vanilla Craft install with only Neo installed. But I have no clear way to reproduce it. When I can, then with a Neo child, that contains a Matrix field with two fields.

image

It also seems that installing SEOmatic or Typed Link Field make it appear more frequent.

ntmagda commented 2 years ago

Hi all,

I would like to add my observation into the context here: What we experienced with the Neo blocks is a weird behaviour with saving entries and validation of required fields

What happens:

Content disappears on save when validation warning has been previously triggered.

How to reproduce:

System specs:

I managed to reproduce it on the fresh craft install.

Craft Pro 4.2.1.1 Neo version: 3.3.3

To watch it in action:

https://user-images.githubusercontent.com/7404843/185645338-9a1ae4fc-faab-4ee0-ba58-42ee0d7060ea.mov

ttempleton commented 2 years ago

@maxstrebel When the provisional draft error occurs, do you get more than one POST at the same time to e.g. index.php?p=admin/actions/elements/save-draft? Just wondering if there's anything happening (whether caused by Neo or not) that could cause a second attempted draft save to start before the first one completes.

@ntmagda I think that's unrelated to this issue, but I've added a fix for that (https://github.com/spicywebau/craft-neo/commit/1c19aff1a4b40812f5eaeb320523329970ab6887) in Neo 3.3.4.

maxstrebel commented 1 year ago

I guess this might have been several different issues at the end.

Thus I'll close this for now. And thanks for the scramble fix 🙏