SSHOC / sshoc-marketplace-backend

Code for the backend
Apache License 2.0

When is a re-index for Solr started? #148

Closed dpancic closed 1 year ago

dpancic commented 2 years ago

In GitLab by @KlausIllmayer on Jan 18, 2022, 16:29

I'm a little unclear about the current situation regarding the full re-index of Solr. I had the impression that a full re-index is only initiated when an administrator calls PUT /api/item-reindex or when the backend container finds out on initial start that Solr is empty. (There should also be a re-index at night based on a cron job, but that does not seem to touch the already existing Solr index.)

The last few days have led me to question my assumption about the Solr re-index at initial start of the backend container. Could it be that every time the backend Docker container is deployed, an automatic re-index of Solr is initiated, thus (sometimes partly?) deleting the already existing Solr index in the persistent Solr part of the backend Docker bundle and making the search endpoints deliver no (or only partial) data?

@tparkola Is it possible to have a look into the code and sum up the situations in which a Solr re-index is initiated and under what circumstances this happens?

dpancic commented 2 years ago

In GitLab by @KlausIllmayer on Jan 18, 2022, 16:56

And in general, the question is whether there could be a better setup so that we don't have a gap where Solr does not return results.

In a discussion @olanowak proposed: But maybe it would be possible to avoid it? I don't really know how MP works internally, but maybe a master-slave config of Solr would help? Frontend connected to the slave and reindexing only on the master. When it's finished -> replicate the index? (Or maybe there is something even better with a cloud Solr setup?)
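The idea in this proposal (rebuild somewhere the frontend does not read from, then switch over once the rebuild is complete) can be illustrated with a small in-memory sketch. It does not talk to Solr at all, and `SwappableIndex`, `search` and `rebuild` are hypothetical names that do not exist in the marketplace code. In a real Solr setup, the switch-over step would roughly correspond to replication, the CoreAdmin SWAP action, or a collection alias in SolrCloud.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.atomic.AtomicReference;
import java.util.function.Supplier;
import java.util.stream.Collectors;

// Demo entry point; all class and method names here are hypothetical.
class SwapDemo {
    public static void main(String[] args) {
        SwappableIndex index = new SwappableIndex();
        index.rebuild(() -> Arrays.asList("solr guide", "java tool"));
        System.out.println(index.search("solr"));
    }
}

// In-memory stand-in for a search index that is rebuilt aside and then swapped in.
class SwappableIndex {
    // The copy that readers currently see; starts out empty.
    private final AtomicReference<List<String>> live =
            new AtomicReference<>(Collections.<String>emptyList());

    // Readers always query the live copy, even while a rebuild is running.
    List<String> search(String term) {
        return live.get().stream()
                .filter(doc -> doc.contains(term))
                .collect(Collectors.toList());
    }

    // Build the new copy completely, then publish it in one atomic step.
    void rebuild(Supplier<List<String>> loader) {
        List<String> fresh =
                Collections.unmodifiableList(new ArrayList<>(loader.get()));
        live.set(fresh);
    }
}
```

The point of the design is that readers never observe a half-built index: until `live.set(fresh)` runs, every search is answered from the previous copy.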

dpancic commented 2 years ago

In GitLab by @KlausIllmayer on Jan 18, 2022, 17:01

mentioned in issue sshoc-marketplace#97

dpancic commented 2 years ago

In GitLab by @tparkola on Jan 19, 2022, 09:15

From what I see in the code, ItemReindex runs:

- after source delete (event listener)

- after actor delete (event listener)

- in MarketplaceStartupRunner, when starting the marketplace backend

Concept and Actor are reindexed only on backend startup.

dpancic commented 2 years ago

In GitLab by @vronk on Jan 23, 2022, 22:54

Is this MarketplaceStartupRunner necessary? Restarting the backend is often triggered by the underlying infrastructure without any change to the code, and that produces seemingly unnecessary unavailability of the index.

dpancic commented 2 years ago

In GitLab by @tparkola on Jan 28, 2022, 13:48

MarketplaceStartupRunner indexes all data in Solr when the application starts. Therefore, if you shut down the backend application, for example to make some updates, you still want to have the data indexed in Solr afterwards, because Solr is dependent on the backend. In conclusion, MarketplaceStartupRunner is necessary for the initial indexing of the data.
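One way to keep the initial indexing while avoiding a rebuild on every restart is to reindex at startup only when the index is empty, which matches the assumption from the original question. The following is a minimal sketch of that check and not the actual MarketplaceStartupRunner code; `SearchIndex`, `countDocuments`, `reindexAll`, `ConditionalStartupReindexer` and `InMemoryIndex` are hypothetical names introduced for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Demo entry point; simulates two consecutive backend starts.
class StartupReindexDemo {
    public static void main(String[] args) {
        InMemoryIndex index = new InMemoryIndex(Arrays.asList("tool-1", "dataset-2"));
        ConditionalStartupReindexer runner = new ConditionalStartupReindexer(index);
        System.out.println("first start reindexed: " + runner.onStartup());
        System.out.println("second start reindexed: " + runner.onStartup());
    }
}

// Hypothetical abstraction over the search index; not part of the marketplace code.
interface SearchIndex {
    long countDocuments();
    void reindexAll();
}

// Startup hook that rebuilds the index only when it contains no documents.
class ConditionalStartupReindexer {
    private final SearchIndex index;

    ConditionalStartupReindexer(SearchIndex index) {
        this.index = index;
    }

    // Returns true if a full re-index was triggered, false if the index was kept.
    boolean onStartup() {
        if (index.countDocuments() > 0) {
            return false; // existing index stays available during restart
        }
        index.reindexAll();
        return true;
    }
}

// In-memory stand-in for Solr, used only to make the sketch runnable.
class InMemoryIndex implements SearchIndex {
    private final List<String> source;
    private final List<String> indexed = new ArrayList<>();
    int reindexCalls = 0;

    InMemoryIndex(List<String> source) {
        this.source = source;
    }

    @Override
    public long countDocuments() {
        return indexed.size();
    }

    @Override
    public void reindexAll() {
        reindexCalls++;
        indexed.clear();
        indexed.addAll(source);
    }
}
```

A check like this only guards against unnecessary rebuilds; it does not detect an index that is non-empty but stale, so an explicit trigger such as PUT /api/item-reindex would still be needed for that case.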

dpancic commented 2 years ago

In GitLab by @tparkola on Feb 4, 2022, 11:54

About Solr reindexing, I found 2 things that can be responsible for excessive reindexing.

Therefore I removed the reindexing after removing a source or actor.

As for the indexing process itself, I managed to reduce the time by replacing some functionality with appropriate DB queries.

@KlausIllmayer @vronk