studioespresso / craft-scout

Craft Scout provides a simple solution for adding full-text search to your entries. Scout will automatically keep your search indexes in sync with your entries.
MIT License

Setting for Element Types used in `SearchableBehavior::getRelatedElements` #242

Closed joshuapease closed 2 years ago

joshuapease commented 2 years ago

Thanks for all your hard work on Scout. It's been a go-to plugin at the agency I work for.

I'd be happy to open a PR for this feature request if it sounds reasonable to include in Scout.


I'm working on a Craft Commerce site with about 250,000 SKUs.

We've recently started hitting "memory limit exceeded" errors on some of our Entries that have many relationships to Products.

Commenting out the following lines in `SearchableBehavior::getRelatedElements` resolved our issue.

https://github.com/studioespresso/craft-scout/blob/master/src/behaviors/SearchableBehavior.php#L150-L151

This got me thinking... for many use cases, checking all of these Element Types is overkill. Most of our indexes only have relations between Entry and Asset elements. In the case of Matrix blocks, those are always tied to an entry, so it would typically be rare to need to update one in isolation (unless an index was directly querying Matrix blocks).

Large sites could see some performance gains if they limited which Element Types are queried.

I've been playing around with this concept in a forked repo. The configuration would look something like this.

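<?php
// config/scout.php

use craft\elements\Asset;
use craft\elements\Entry;
use craft\helpers\App;

return [
    'application_id' => App::env('ALGOLIA_APP_ID'),
    'search_api_key' => App::env('ALGOLIA_SEARCH_API_KEY'),
    'admin_api_key' => App::env('ALGOLIA_ADMIN_API_KEY'),
    // Proposed setting: only these element types are checked for relations
    'relatedElementTypes' => [
        Entry::class,
        Asset::class,
    ],
];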

// SearchableBehavior.php

public function getRelatedElements(): Collection
{
    $settings = Scout::$plugin->getSettings();

    if (!$settings->sync) {
        return new Collection();
    }

    // If relatedElementTypes is set, query only those element types
    if (!empty($settings->relatedElementTypes)) {
        return (new Collection($settings->relatedElementTypes))
            ->flatMap(function ($className) {
                // Elements of this type related to the owner, across all sites
                return $className::find()->relatedTo($this->owner)->site('*')->all();
            });
    }

    // Otherwise, fall back to checking all element types for relationships
    // (existing code, omitted here)
}
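
For anyone who wants to try the selection logic without a Craft install, here is a rough plain-PHP sketch. Everything in it is hypothetical scaffolding: `relatedElements()`, the `$finders` map, and the string type names stand in for Craft's element queries and are not part of Scout.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical stand-in for getRelatedElements().
 *
 * @param array<string, callable(): array> $finders      one lookup per element type
 * @param string[]                         $allowedTypes the proposed relatedElementTypes setting
 */
function relatedElements(array $finders, array $allowedTypes = []): array
{
    // An empty setting means "query every type", mirroring the fallback.
    $types = $allowedTypes !== [] ? $allowedTypes : array_keys($finders);

    $related = [];
    foreach ($types as $type) {
        if (!isset($finders[$type])) {
            continue; // unknown type: nothing to query
        }
        // Only the lookups for the selected types ever run.
        foreach ($finders[$type]() as $element) {
            $related[] = $element;
        }
    }

    return $related;
}

// Fake lookups; the Product one plays the role of the expensive query.
$finders = [
    'Entry' => fn(): array => ['entry-1', 'entry-2'],
    'Asset' => fn(): array => ['asset-1'],
    'Product' => fn(): array => array_map(
        fn(int $i): string => "product-$i",
        range(1, 1000)
    ),
];
```

Calling `relatedElements($finders, ['Entry', 'Asset'])` never invokes the Product lookup, which is the saving the proposed setting is after. Calling it with no second argument runs all three.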

If this is something you'd be interested in including in Scout I'd be happy to open a PR.

I have this working in a forked version of this plugin.

janhenckens commented 2 years ago

Hey @joshuapease, that could be a good solution, you're welcome to make a PR for it.

Are you aware of the option to set `indexRelations` to false, which completely removes the indexing of related elements? Not sure if that works for your use.
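
For reference, a minimal sketch of that option in `config/scout.php`, assuming the key is spelled `indexRelations` as written above (all other settings omitted):

```php
<?php
// config/scout.php (sketch only)

return [
    // Skip related elements entirely when keeping indexes in sync
    'indexRelations' => false,
];
```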

joshuapease commented 1 year ago

Apologies for the radio silence on this one. Here's a PR of the changes we made in our fork of the project. They've served us well the past year without any bugs.

We're up to 650,000 SKUs now 😅

janhenckens commented 1 year ago

Hey @joshuapease, thanks for the update & the PR. Which version of Craft and Scout are you running at the moment? There is a beta out for Craft 3 right now that moves all processing to a queue job (and the same feature is in the regular release for Craft 4).

joshuapease commented 1 year ago

Thanks @janhenckens! Those perf improvements from moving things to the queue look really beneficial.

Version numbers are a little fuzzy with the fork, but I believe we were on [2.7.2](https://github.com/studioespresso/craft-scout/releases/tag/2.7.2) when the issue started happening on our project.

However, I looked back at our internal tickets that prompted the fork, and it looks like it was queue jobs that were failing, so I'm not sure if our issue would be fully resolved by the beta versions for Craft 3.

Some of the `getRelatedElements()` calls would try to return 150k+ Product elements.