statamic / eloquent-driver

Provides support for storing your Statamic data in a database, rather than flat files.
https://statamic.dev/tips/storing-content-in-a-database
MIT License
104 stars 74 forks source link

Fetching 'Relationships' means fetching 25k+ models? #110

Closed geertjanknapen1 closed 12 months ago

geertjanknapen1 commented 1 year ago

So I have a Product collection which contains, you guessed it, products. These products can have a Product Category which is a taxonomy.

I get these products querying the EntryFacade, and then returning them to a view as $products.

Now comes the real kicker, the moment I call a foreach (in the blade) on $product->product_categories (the Taxonomy Term field within my Product blueprint), the debugbar tells 71 TermModels are fetched, and about 25302 EntryModels are fetched. Seems a bit excessive to me. Products will also have tags which it can have a lot more of, so if 1 taxonomy term is a problem, I can't image what about 5 of them will do..

So I'm left wondering, what am I doing wrong here? Why, when I call the product_categories on the product, are 25k+ records fetched? The Statamic install is multisite and supports 5 languages. I'm very new to Statamic and to the eloquent-driver package as well so if any advice could be as fool-proof as possible that would be great.

Mind you that this query is currently not cached, but caching the query makes such a minimal difference that it does not matter for the question.

Some screenshots and a snippet for clarification:

(cache_ttl is 1 for now)

$queryBuilder = Cache::remember('products', ENV('CACHE_TTL', 600), function () use ($collectionHandle) {
  return EntryFacade::query()
    ->where('collection', $collectionHandle) // $collectionHandle == 'products'
    ->where('published', true)
    ->where('site', app()->getLocale())
    ->get();
});

return $queryBuilder;

Database snip (will blur some things): image

Product blueprint: image image

View debugbar without querying product_categories : image

View debugbar with querying: image

ryanmitchell commented 1 year ago

Definitely a few too many queries... I've added some blink caching in this PR: https://github.com/statamic/eloquent-driver/pull/111

Can you pull it in and see how it affects your query and model counts?

geertjanknapen1 commented 1 year ago

@ryanmitchell Hey Ryan, will be checking this out today and report back, thanks in advance!

Edit: findings;

It does consistently knock off about 4k models, but that does mean it still loads 21k models when I query for the category, only 29 when I don't. Also seems to be a lot of repeated queries still, not as much as before but still.

Now please do keep in mind that this may full-well be a user error.

Even though I am grateful for your help, this has not fixed my problems just yet.

When not querying for the product_categories: image

image

When querying for the product_categories: image

image

geertjanknapen1 commented 1 year ago

bump

ryanmitchell commented 1 year ago

Sorry never got a notification (maybe as you edited rather than fresh posted). Can you show the duplicate queries, so I can see where they might be coming from?

geertjanknapen1 commented 1 year ago

Query tab amounts:

image

Mainly where in slug queries:

image

Some of these:

image

But I noticed it also does these sometimes, even though I always query with a where:

image
ryanmitchell commented 1 year ago

Ok thats useful - I think a lot of your extra queries are coming in from [applyCollectionAndTaxonomyWheres](https://github.com/statamic/eloquent-driver/blob/master/src/Taxonomies/TermQueryBuilder.php#L204). Maybe you could look at optimising that to minimise queries?

The schema query is coming from core, but could definitely be cached for optimisation. I'll work on this.

The duplicate entry queries are strange as those are definitely Blinked... are you able to see where they are being called as your line number doesn't sync with the current core Entry.php file.

geertjanknapen1 commented 1 year ago

@ryanmitchell Hey Ryan, thanks! I commented on your PR, and will try to look into this later, thanks! Regarding the queries not being Blinked, I have changed back to statamic 3 after testing since I'm working on multiple features at once. However, the amount of duplicated queries are almost the same with and without the Blinking

geertjanknapen1 commented 1 year ago

@ryanmitchell Hey Ryan, I've been tinkering around a bit.

I'm using the changes in your newest PR (https://github.com/statamic/cms/pull/7466 ).

The amount of queries (namely the duplicated ones) are still very high, however, all duplicated queries are now coming from /vendor/statamic/eloquent-driver/src/Entries/Entry.php:102. Which has to do with the origin I suppose. And from /vendor/statamic/cms/src/Query/EloquentQueryBuilder.php:402 and /vendor/statamic/cms/src/Query/EloquentQueryBuilder.php:58.

As for your suggestion regarding applyCollectionAndTaxonomyWheres, I do not understand Statamic and the Eloquent driver package enough to make any optimisations.

There's a good bunch of these, like 200+ maybe: image

And there's these which are related to not defining columns (L:58) and the Schemas caching (L:402: image

ryanmitchell commented 1 year ago

@geertjanknapen1 I've opened https://github.com/statamic/eloquent-driver/pull/123 to try and get around the duplicates on the Entry models you're seeing.

On the schemas - is that latest screenshot after using the PR on core? If so its not worked how I intended.

geertjanknapen1 commented 1 year ago

@ryanmitchell Correct that is the latest core PR indeed. I'll probably come back on #123 tomorrow if that's okay with you.

ryanmitchell commented 1 year ago

Interesting. I've updated the core PR to use Blink instead - can you update that and see if that helps at all?

geertjanknapen1 commented 1 year ago

After applying both the core PR with Blink and #123 it brought down the queries quite a bit, models still high though. It seems to do a query on the slug, and then a sub-query for the taxonomy with a where in slug for every slug in that taxonomy.

image

ryanmitchell commented 1 year ago

Progress at least. Theres definitely scope to improve the taxonomy query a little more, when I get some time I'll try and work on that.

geertjanknapen1 commented 1 year ago

Right progress indeed, thanks for the help and thanks in advance for the help yet to come!

ryanmitchell commented 1 year ago

@geertjanknapen1 a quick improvement here https://github.com/statamic/eloquent-driver/pull/124 - it should get rid of the duplicate select * from entries queries you were seeing.

I think this is about as much as I can do without it starting to affect the data returned - I cant think of any way of caching the taxonomy_term queries driver side.

mauricewijnia commented 1 year ago

I'm running into the same issue. I tracked the origin of the query back to this line of code in the CMS codebase. Disabling this line stops the select * from entries queries and prevents loading all those models.

I'm not sure if we can do something about it since I'm not entirely sure why the collection query would be necessary and I'm even less sure why this where statement results in a query for all collection entries...

ryanmitchell commented 1 year ago

@mauricewijnia are you using an entries field or something similar? The PRs I’ve referenced here should cut down the query count significantly.

mauricewijnia commented 1 year ago

@ryanmitchell I'm using the Terms field that gets attached to a collection blueprint when you link a taxonomy with a collection.

Limiting the query count helps, but I'm wondering if the query is necessary at all?

When I have some time I'll dive a bit deeper into this, for now I've extended the Terms field and disabled the collection query for our projects, not ideal but it works for now 😄

ryanmitchell commented 1 year ago

I think @FrittenKeeZ has done some work on the performance of the Terms field type (we dont use Taxonomies on our installs). Maybe he has a PR pending for that?

FrittenKeeZ commented 1 year ago

I've only done a bug fix in regards to the TermQueryBuilder - the performance issue I encountered was due to the Entries relationship augmenting all fields when in select mode instead of only id and title.

ryanmitchell commented 1 year ago

@geertjanknapen1 I would be hopeful that v2.2.0 reduces these queries significantly. Can you check now and let me know what you are seeing so we can either close this issue or find opportunities for further improvements?

geertjanknapen1 commented 1 year ago

Heya @ryanmitchell We've internally fixed this problem, so it will take me a while to properly test this without any of our custom code also interfering. I'm very busy at the moment but I hope I can get it done before the end of next week.

I will report whatever I find here and tag you.

Thanks for notifying me, I appreciate it!

ryanmitchell commented 1 year ago

No problem - looking forward to seeing the solutions you came up with.

ryanmitchell commented 12 months ago

Closing due to inactivity. Feel free to open a PR with the changes you've implemented.