Problems re-indexing on Magento Cloud (Magento 2.4.3): indexes are very large, and adding store views causes indexing to fail

leedyche commented 2 years ago

I'm using Magento 2.4.3 on Adobe Cloud with Elasticsearch 7.7 and ElasticSuite 2.10.3. We have around 30,000 products and currently 2 store views.

Indexing the store from scratch takes a long time, at least 2 hours; when we had only one store view it took about an hour. I now want to add a 3rd store view, and have done so in staging. However, I'm now struggling to get the indexes to build. They seem to get stuck in a cycle, as if the indexer has been kicked off in the background by something else. Multiple indexes are then created for the same store view with different timestamps, and the index process hangs or falls over.

I found that if I drop all the indexes and re-index with cron disabled, with the indexers set to update on save, and with the site in maintenance mode, then I can sometimes manage to re-index; this is how I got the 2 store views indexed in staging. However, it took 2 attempts (about 3 hours per attempt to index the 3 store views).

Is anyone else having similar issues? It seems my indexes are very large: for 30,000 products the largest index has just over 500,000 records in it. Is this normal? I've started wading through the documents to see if I can find duplicates, but haven't found any obvious issues yet.

When I get the indexes built, it does work great. However, I've had to abort the go-live of my 3rd store view, as the indexing failed 3 times in a row (taking about 7 hours) and I couldn't afford any more downtime, so I was forced to remove the 3rd store view and re-index again.

Adobe will not help, as they won't support ElasticSuite.

Any ideas on how to get to the bottom of this would be greatly appreciated.

thanks

Preconditions

Magento Version : 2.4.3 on adobe cloud

ElasticSuite Version : 2.10.3

Environment : Both

Third party modules :

Steps to reproduce

1. Drop the Elasticsearch indexes
2. Re-index
3. The index process never finishes, and lots of indexes get created

Expected result

All indexes should be created correctly for each store view

Actual result

[Screenshot, logs]

vahonc commented 2 years ago

Hello @leedyche,

It seems that your issue is linked to the configuration of your environment. Therefore we need more details about this.

How many different customer groups, nodes, shards, and replicas do you have? What is the size of your indexes (use this command: curl http://localhost:9200/_cat/indices?v)?

You can also use the command curl -XGET 'http://localhost:9200/_nodes?pretty' to get the other information we need, but please do not copy/paste all of it into the ticket. Alternatively, you can give us the value of the admin interface parameter listing the nodes, like here.
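To spot the duplicate timestamped indexes mentioned in the original report, the `_cat/indices` output can be grouped by base name. This is only a minimal Python sketch. It assumes the rows come from `_cat/indices?format=json` and that physical index names end with a `_YYYYMMDD_HHMMSS` suffix; the sample index names and counts are made up for illustration.

```python
import re
from collections import defaultdict

# Assumption: physical index names end with a _YYYYMMDD_HHMMSS timestamp suffix.
TIMESTAMP_SUFFIX = re.compile(r"_\d{8}_\d{6}$")


def group_by_base_name(rows):
    """Group _cat/indices rows (format=json) by index name minus its timestamp."""
    groups = defaultdict(list)
    for row in rows:
        base = TIMESTAMP_SUFFIX.sub("", row["index"])
        groups[base].append(row["index"])
    return {base: sorted(names) for base, names in groups.items()}


def duplicated(rows):
    """Keep only the base names that have more than one physical index."""
    return {b: n for b, n in group_by_base_name(rows).items() if len(n) > 1}


# Hypothetical sample, shaped like the JSON returned by _cat/indices?format=json.
sample_rows = [
    {"index": "magento2_default_catalog_product_20220110_101500", "docs.count": "616744"},
    {"index": "magento2_default_catalog_product_20220110_113000", "docs.count": "120331"},
    {"index": "magento2_default_catalog_category_20220110_101400", "docs.count": "9120"},
]

for base, names in duplicated(sample_rows).items():
    print(base, "->", names)
```

Seeing two indexes for one base name while a full reindex is running may simply be the old index plus the one being built; several new ones appearing for the same store view would be more consistent with concurrent indexer runs.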

Another point: as I understand it, your issue occurs on a staging environment, so it may be linked to parameters such as allocated memory. Do you have any issues on the Prod environment?

> Is anyone else having similar issues? It seems my indexes are very large, so for 30,000 products the largest index has just over 500,000 records in it - is this normal?

As for this, yes it's normal.

p.s. If you are interested in dedicated (paid) support through professional services, feel free to write an email to elasticsuite@smile.fr.

BR, Vadym

romainruaud commented 2 years ago

Hi @leedyche

In addition to what Vadym asked, I can confirm that, normally, such a catalog would take a couple of minutes to reindex, unless you have thousands of customer groups or categories.

By the way, a "record" (called a document) in Elasticsearch is a bit different from what you would have in SQL:

A product by itself is a document, but nested fields inside a product are also documents: each price, each category, each stock item, etc.
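A rough way to picture this count, as a sketch only: the field names and the shape of the product below are hypothetical (including the idea of one price entry per customer group), not the actual ElasticSuite mapping.

```python
def lucene_doc_count(product, nested_fields=("price", "category", "stock")):
    """Count one root document plus one document per nested entry."""
    return 1 + sum(len(product.get(field, [])) for field in nested_fields)


# Hypothetical product: 13 prices, 5 categories, 1 stock item.
product = {
    "sku": "EXAMPLE-1",
    "price": [{"customer_group_id": group} for group in range(13)],
    "category": [{"category_id": cat} for cat in (3, 10, 42, 77, 105)],
    "stock": [{"is_in_stock": True}],
}

print(lucene_doc_count(product))
```

With numbers like these, a single product already accounts for about twenty documents in the index statistics.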

regards

leedyche commented 2 years ago

Hi, thanks for the responses.

@vahonc we have 13 customer groups - we do have a large number of product categories (with a lot of nesting).

In terms of index size please see below; note we currently have 2 store views, 'default' and 'nglishnosales'.

In terms of other info,

Number of nodes: 3
Shards per index: 3
Replicas per index: 2

To clarify - the only way I can create new store views (in live or staging) is to drop all Elasticsearch indexes and re-index (with cron disabled). This takes several hours on both live and staging, and every time I add a store view it takes longer. I have monitored the indexes with _cat/indices?v as it runs, and it is always the catalog_product_xxxxx index that takes the time to build - and often fails.

I was able to set this up in staging, but when I tried to create a new store view in live, I couldn't get the indexes to rebuild and had to roll back. If I can understand why the index is taking so long to build and reduce it to a few minutes, or even 30 minutes, that would be great.

thanks

leedyche commented 2 years ago

Further to this, I've used the _count API to check, and we have 32k documents, which is about the number of products we have. This translates to 616,744 'documents' when including all the nested properties.
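As a quick sanity check on those two figures (treating "32k" as an approximate 32,000), the ratio works out as follows:

```python
products = 32_000      # approximate top-level count reported by _count ("32k")
lucene_docs = 616_744  # total count including nested documents

docs_per_product = lucene_docs / products
nested_per_product = docs_per_product - 1  # subtract the root product document

print(f"{docs_per_product:.1f} documents per product, "
      f"{nested_per_product:.1f} of them nested")
```

Roughly 19 documents per product fits the earlier explanation that every nested price, category, and stock entry is stored as its own document.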

rbayet commented 2 years ago

Hello @leedyche,

Could you provide us with the heap size configured for your nodes, and confirm that all 3 of them are configured as master/data?

Anyway, considering you have 3 nodes and the total size of your product index, I don't really think you need 3 shards per index with 2 replicas per index : this means that each of your active indices has 9 shards in total. You could change the configuration to have only one primary shard per index and 2 replicas and still retain the ability to lose one node of the cluster without losing data/expected behavior.

If your nodes are configured with minimum HEAP size (say 512MB), it would help reach the good rule of thumb of not having more than 20 shards per GB of HEAP per node. Shards and documents allocation should be way faster.
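The shard arithmetic above can be sketched in a few lines of Python. The function names are only illustrative, and the 20-shards-per-GB figure is the rule of thumb quoted in this comment, not a hard limit.

```python
def shards_per_index(primaries, replicas):
    """Total shard copies for one index: each primary plus its replicas."""
    return primaries * (1 + replicas)


def max_shards_per_node(heap_gb, shards_per_gb=20):
    """Rule-of-thumb ceiling on the number of shards a node should hold."""
    return int(heap_gb * shards_per_gb)


current = shards_per_index(primaries=3, replicas=2)   # configuration reported above
proposed = shards_per_index(primaries=1, replicas=2)  # suggested configuration

print("shards per index, current:", current)
print("shards per index, proposed:", proposed)
print("rule-of-thumb shards per node at 512 MB heap:", max_shards_per_node(0.5))
```

With one primary and two replicas on a three-node cluster, every node can still hold a full copy of each index, which is why losing one node remains tolerable.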

Regards,

leedyche commented 2 years ago

Hi @rbayet,

Thanks for the tips - I haven't really messed with the memory config or the number of shards/replicas myself (as Adobe are meant to manage it), but I will see if I am able to modify these myself and run some tests on staging.

In terms of the data you wanted, the heap size of all nodes is as below:

"mem" : {
  "heap_init_in_bytes" : 8589934592,
  "heap_max_in_bytes" : 8589934592,
  "non_heap_init_in_bytes" : 7667712,
  "non_heap_max_in_bytes" : 0,
  "direct_max_in_bytes" : 0
}

config for all is

"roles" : [ "ingest", "master", "data", "remote_cluster_client" ]

That comes from running the previous commands; I assume they are the correct fields?
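For anyone pulling the same fields, here is a small Python sketch that summarizes heap size and roles from a `_nodes` response that has already been parsed into a dict. The sample node ID and name are hypothetical; the heap value and roles are the ones quoted above.

```python
def summarize_nodes(nodes_response):
    """Extract heap size (GiB) and master/data roles for each node."""
    summary = {}
    for node_id, node in nodes_response["nodes"].items():
        heap_bytes = node["jvm"]["mem"]["heap_max_in_bytes"]
        summary[node.get("name", node_id)] = {
            "heap_gib": heap_bytes / 2**30,
            "master_eligible": "master" in node["roles"],
            "data": "data" in node["roles"],
        }
    return summary


# Hypothetical single-node excerpt of a _nodes response.
sample = {
    "nodes": {
        "example-node-id": {
            "name": "node-1",
            "roles": ["ingest", "master", "data", "remote_cluster_client"],
            "jvm": {"mem": {"heap_max_in_bytes": 8589934592}},
        }
    }
}

for name, info in summarize_nodes(sample).items():
    print(name, info)
```

A heap_max_in_bytes of 8589934592 is exactly 8 GiB.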

cheers!

rbayet commented 2 years ago

Hello @leedyche,

Thanks for the extra information. So yes, you shouldn't in theory have any performance problem at indexing time with 8 GB heap per node, at least concerning the number of shards. And all your nodes are indeed master/data nodes (can be elected as master and will hold shard data).

There might be some external reasons :

- the nodes being shared between multiple instances/projects (and therefore holding (many?) more shards than those of your production environment)
- insufficient memory on the machine hosting each node (50% of memory should be kept for I/O)

Anyway, you should be able to change the primary/replica settings:

- either directly through the admin (see https://github.com/Smile-SA/elasticsuite/wiki/ModuleInstall#indices-settings)
- or, if you can't, or if those settings keep getting changed after each deployment, through the Magento Cloud deployment configuration, by setting up the correct environment variables (see https://devdocs.magento.com/cloud/project/services-elastic.html#add-plugins-for-elasticsearch and the partially deprecated https://github.com/Smile-SA/elasticsuite/wiki/MeceInstall#by-using-an-environment-variable), either through .magento.app.yaml or through the Magento Cloud interface
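As a rough illustration of the environment-variable route, Magento lets a configuration value be overridden by an environment variable whose name is derived from the config path. The Python sketch below only builds such names. The config paths used are an assumption about where ElasticSuite stores its indices settings and should be checked against the wiki pages linked above before use.

```python
def env_var_for_config_path(path, scope="default"):
    """Build a Magento-style override variable name from a config path."""
    return "CONFIG__" + scope.upper() + "__" + path.replace("/", "__").upper()


# Assumption: these config paths are illustrative and must be verified
# against the ElasticSuite documentation for your version.
assumed_paths = {
    "smile_elasticsuite_core_base_settings/indices_settings/number_of_shards": 1,
    "smile_elasticsuite_core_base_settings/indices_settings/number_of_replicas": 2,
}

for path, value in assumed_paths.items():
    print(f"{env_var_for_config_path(path)}={value}")
```

The values 1 and 2 mirror the one-primary, two-replica layout suggested earlier in the thread; existing indexes would presumably only pick up a new shard count once they are rebuilt.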

Regards,

github-actions[bot] commented 2 years ago

This issue was waiting update from the author for too long. Without any update, we are unfortunately not sure how to resolve this issue. We are therefore reluctantly going to close this bug for now. Please don't hesitate to comment on the bug if you have any more information for us; we will reopen it right away! Thanks for your contribution.

ivanaugustobd commented 1 year ago

Any news @leedyche? I'm facing a very similar problem with a catalog of 9.000+ categories (yes, categories, not products)

Here it needs 6 GB of heap to execute properly (I'm still trying different shard/replica settings to lower this)

leedyche commented 1 year ago

> Any news @leedyche? I'm facing a very similar problem with a catalog of 9.000+ categories (yes, categories, not products)
>
> Here it needs 6gb of HEAP size to execute properly (I'm still trying different shards/replica settings to lower this)

Hi @ivanaugustobd ,

In the end I had to abandon ElasticSuite. We are running on Adobe Commerce Cloud, where the Elasticsearch instance is managed for us and we have limited access to the infrastructure. Trying to get support to debug this with us was a nightmare, so I had to look for other options. It was a shame, as I liked the features.

I ended up writing my own search module that uses a vanilla Elasticsearch instance, and was able to get much better performance in terms of indexing; it also allowed me to tailor it to our needs. If you get to the bottom of it, please post, as it would be good to have moving back to ElasticSuite as an option in the future.

thanks
