studioespresso / craft-scout

Craft Scout provides a simple solution for adding full-text search to your entries. Scout will automatically keep your search indexes in sync with your entries.
MIT License
81 stars 54 forks source link

[5.0.0-beta1] Refreshing index or saving entry in section causes Memory allocation error: 1038 Out of sort memory #298

Closed zizther closed 1 month ago

zizther commented 2 months ago

Issue

Since upgrading to Craft 5 from Craft 4 and to Scout 5.0.0-beta.1 I am encountering an issue where trying to refresh an index, which looks to get all entries from one section that contains 3000 entries, causes a MySQL issue Memory allocation error: 1038 Out of sort memory.

Refreshing another index which gets all users, of which there are 550 works fine.

I don't think that increasing server sort buffer size would be necessary. The index refresh is working fine on Craft 4.9.1 and Scout 4.0.0.

Resaving a single entry to be indexed also causes the same error for the section.

The entry is only made up of plain text, lightswitch, dropdown and WYSIWYG fields, no matrix fields.

The error message from the queued job is:

SQLSTATE[HY001]: Memory allocation error: 1038 Out of sort memory, consider increasing server sort buffer size The SQL being executed was: SELECT `elements`.`id`, `elements`.`canonicalId`, `elements`.`fieldLayoutId`, `elements`.`uid`, `elements`.`enabled`, `elements`.`archived`, `elements`.`dateLastMerged`, `elements`.`dateCreated`, `elements`.`dateUpdated`, `elements_sites`.`id` AS `siteSettingsId`, `elements_sites`.`siteId`, `elements_sites`.`title`, `elements_sites`.`slug`, `elements_sites`.`uri`, `elements_sites`.`content`, `elements_sites`.`enabled` AS `enabledForSite`, `entries`.`sectionId`, `entries`.`fieldId`, `entries`.`primaryOwnerId`, `entries`.`typeId`, `entries`.`postDate`, `entries`.`expiryDate` FROM (SELECT `elements`.`id` AS `elementsId`, `elements_sites`.`id` AS `siteSettingsId` FROM `elements` `elements` INNER JOIN `entries` `entries` ON `entries`.`id` = `elements`.`id` INNER JOIN `elements_sites` `elements_sites` ON `elements_sites`.`elementId` = `elements`.`id` WHERE (`entries`.`sectionId`=4) AND (CAST((`elements_sites`.`content`->>'$.\"d265d10c-62aa-4ab7-961b-309ad1f8f1dd\"') AS CHAR(5))='true') AND ((CAST((`elements_sites`.`content`->>'$.\"2781287a-7531-40c6-aedc-e6df10e65710\"') AS CHAR(5))='false') OR (CAST((`elements_sites`.`content`->>'$.\"2781287a-7531-40c6-aedc-e6df10e65710\"') AS CHAR(5)) IS NULL)) AND (((`elements`.`enabled`=TRUE) AND (`elements_sites`.`enabled`=TRUE)) AND (`entries`.`postDate` <= '2024-05-04 14:11:59') AND ((`entries`.`expiryDate` IS NULL) OR (`entries`.`expiryDate` > '2024-05-04 14:11:59'))) AND (`elements`.`archived`=FALSE) AND (`elements`.`dateDeleted` IS NULL) AND (`elements`.`draftId` IS NULL) AND (`elements`.`revisionId` IS NULL) ORDER BY `entries`.`postDate` DESC, `elements`.`id` DESC) `subquery` INNER JOIN `elements` `elements` ON `elements`.`id` = `subquery`.`elementsId` INNER JOIN `elements_sites` `elements_sites` ON `elements_sites`.`id` = `subquery`.`siteSettingsId` INNER JOIN `entries` `entries` ON `entries`.`id` = `subquery`.`elementsId` ORDER BY `entries`.`postDate` DESC, `elements`.`id` DESC

Versions

Scout version: 5.0.0-beta.1 Craft CMS version: 5.1.1

janhenckens commented 2 months ago

Hey @zizther, do you have "index relations" enabled? And if you're able, could you share your index & transformer setup? Thanks!

zizther commented 2 months ago

@janhenckens Thanks for getting back so quickly. I have tried with and without, the issues happens with either.

I also tried playing around stripping it back to just indexing an ID, that still caused the error. The only way i've found to stop the error is to limit the criteria query to 500, then it does it with ease.

Here is the scout config I am using. The guard behaviour adds some light params for consistency across the site. scout.php.zip

janhenckens commented 2 months ago

Thanks, I'll have a look at your config.

This also got me thinking that Scout could/should be taking advantage of the batched queue jobs that were added in 4.6(?), at least for the larger jobs like a full index refresh.

zizther commented 2 months ago

Thanks. I think the transformer is fairly light, and that generally this shouldn't be difficult for Craft

Nice idea about the batched jobs

janhenckens commented 2 months ago

Minor question @zizther but I see you have "storyReadTime" as a property in your transformer. What type of field is that?

zizther commented 2 months ago

@janhenckens this is just a plain text field. The content in the entry is only plain text or CKEditor fields

janhenckens commented 2 months ago

@zizther Tested this with a similar content with about 6000 entries and I'm not seeing anything special (running locally on DDEV).

Not sure if you have any other plugins or modules enabled, but if you're able to, could to try disabling everything but scout and see if the error still pops up?

zizther commented 1 month ago

@janhenckens Thanks for testing that out. I have tested on my local DDEV, disabling all plugin, apart from Scout, and custom modules.

The queue log didn't really show too much more than what i've already mentioned, however I have emailed this over to you.

This was a Craft 4.9 to 5.1 update, not sure if this could have caused any issues in relation to the data management.

janhenckens commented 1 month ago

Tested with the site/database in question, here are my findings:

Thus the issue seems related to that specific entry and Craft 5, not to Scout.

I'll close this issue for now - feel free to reference this off course should be raise the issue with P&T!

zizther commented 1 month ago

Thanks for your help looking into this. Upon further investigation, this is an issues with Craft 5 itself and the queries based on the volume of content in the query.

It just so happened to be the Scout plugin was the first it show there was an issue.