coreshop / CoreShop

CoreShop - Pimcore enhanced eCommerce
http://www.coreshop.org
Other
276 stars 157 forks source link

findRecursiveVariantIdsForProductAndStoreByProducts Performance / Bug #2519

Closed solverat closed 9 months ago

solverat commented 9 months ago
Q A
Bug report? yes
Feature request? no
BC Break report? no
RFC? no

Performance

https://github.com/coreshop/CoreShop/blob/dd7aadeeeb5af284d9e47be47719d4db9a94bb94/src/CoreShop/Bundle/CoreBundle/Pimcore/Repository/ProductRepository.php#L64-L71

To demonstrate, here's a product which we need to display with all its possible variant prices in a table:

With enabled findRecursiveVariantIdsForProductAndStoreByProducts

image

However, improving the query doesn't change a thing. The real problem is, that (depending on the project and price rules you're using) this query gets executed a million times. This leads to database crashes.

Bug

I also noted, that this method does not respect products given store!

Fix

Cache Entry

Instead of executing the same query over and over, we've added a short cache item:

$cacheKey = sprintf('cc_rvips_%s_%d', md5(implode('-', $products)), $store->getId());
Cache::save($resultProducts, $cacheKey, [], 500, 0, true);

Summary

$cacheKey = sprintf('cc_rvips_%s_%d', md5(implode('-', $products)), $store->getId());

if(false !== $cacheData = Cache::load($cacheKey)) {
    $resultProducts = $cacheData;
} else {

    /** @psalm-suppress InternalMethod */
    $query = sprintf('
        SELECT oo_id as id FROM (
            SELECT CONCAT(o_path, o_key) as realFullPath FROM objects WHERE o_id IN (:products)
        ) as products
        INNER JOIN %s variants ON variants.o_path LIKE CONCAT(products.realFullPath, "/%%")
        WHERE variants.stores LIKE :store
    ', $dao->getTableName());

    $params = [
        'products' => $products,
        'store'    => '%,' . $store->getId() . ',%'
    ];

    $paramTypes = [
        'products' => Connection::PARAM_STR_ARRAY,
    ];

    $resultProducts = $this->connection->fetchAllAssociative($query, $params, $paramTypes);

    Cache::save($resultProducts, $cacheKey, [], 500, 0, true);
}

// [...]
dpfaffenbauer commented 9 months ago

@solverat that is new function right? did you try the old one too? was that faster or slower for you?

dpfaffenbauer commented 9 months ago

Caching usually is fine, but you should also think about invalidation and pre-populating the cache

solverat commented 9 months ago

Yes. It was (3 times) faster - but in the end it was not enough on our side. The amount of executed queries is/was the issue.

Regarding invalidation: In my example, I set the lifetime to 500 seconds. However, not sure how to solve it in core. We could add an attribute int $cache = 0. 0 means no caching and add a new option to rule condition, where the user can decide how long this should get cached.

dpfaffenbauer commented 9 months ago

wow, in my case the old one was x times slower. interesting. I will do some tests with loads of data. I have a project with more than 1,3 million sku's, so there I'll see performance issues

dpfaffenbauer commented 9 months ago

Yeah ok, when I run that query with a the first level of products, it takes about 12 seconds to get 1.3 million ids. Obviously it adds up if you have multiple heavy queries like that. So caching is a pretty good idea there. What if we add a event-handler to rule save and pre-populate the cache with a lifetime of about half a day?

solverat commented 9 months ago

Yes, thats exactly what i did:

public static function getSubscribedEvents(): array
{
    return [
        'coreshop.product_price_rule.post_save'          => ['invalidateRuleCache'],
        'coreshop.product_specific_price_rule.post_save' => ['invalidateRuleCache'],
        'coreshop.cart_price_rule.post_save'             => ['invalidateRuleCache'],
        'coreshop.payment_provider_rule.post_save'       => ['invalidateRuleCache'],
        'coreshop.shipping_rule.post_save'               => ['invalidateRuleCache'],
    ];
}

public function invalidateRuleCache(ResourceControllerEvent $event): void
{
    Cache::clearTag(ProductRepository::VARIANT_RECURSIVE_QUERY_CACHE_TAG);
}

I set the lifetime to 15min, if an admin moves a sub product out of scope he needs to wait 15 mins until the rule won't apply on it. The naming is a bit misleading, we're not only talking about variants but sub products too.

dpfaffenbauer commented 9 months ago

I'll make a PR. I want to make a new release tomorrow, so let's see if we make it until then ;)

dpfaffenbauer commented 9 months ago

@solverat

With something like that, we could pre-populate the cache. Problem is, how do we accurately know when to re-calculate it?

public function __invoke(RecursiveVariantIdCalculationMessage $message): void
    {
        $stores = $this->storeRepository->findAll();

        //Populate Cache for all stores and all Product Types
        foreach ($stores as $store) {
            foreach ($this->productStackRepository->getClassIds() as $classId) {
                $class = ClassDefinition::getById($classId);

                if (!$class) {
                    continue;
                }

                $classRepo = $this->objectManager->getRepository($class->getName());

                if (!$classRepo instanceof ProductVariantRepositoryInterface) {
                    continue;
                }

                //Warmup Cache
                $classRepo->findRecursiveVariantIdsForProductAndStoreByProducts($message->getProducts(), $store);
            }
        }
    }
solverat commented 9 months ago

We can't i guess. Just invalidation when updating rules is possible.

dpfaffenbauer commented 9 months ago

@solverat https://github.com/coreshop/CoreShop/pull/2520