silverstripe / silverstripe-fulltextsearch

Adds external full text search engine support to Silverstripe
BSD 3-Clause "New" or "Revised" License
44 stars 83 forks source link

RFC: Search index update/new searcher on auto commit #331

Open michalkleiner opened 2 years ago

michalkleiner commented 2 years ago

Following on from/coming back to https://github.com/silverstripe/silverstripe-fulltextsearch/pull/278 and https://github.com/silverstripe/silverstripe-fulltextsearch/issues/274, it seems either something changed in Silverstripe Cloud Platform or the change in the merged PR worked intermittently/by accident, but there were several sites with issues relating to being able to find newly published content changes through the search on sites on SCP. Full reindex works, but publishing a page doesn't cause the content change to become visible in search.

The latest advice from SCP support is to enable the openSearcher option within the autoCommit config, causing the searchers to reload after each auto commit. The setting can be enabled by copying the solrconfig.xml into the project code, adjusting the config there to override the default, and pointing Solr to it to use when configuring the core as a workaround.

<autoCommit>
    <maxTime>${solr.autoCommit.maxTime:15000}</maxTime>
    <openSearcher>true</openSearcher>
</autoCommit>

Should this become the default? Would it have any unwanted implications for systems/platforms, possibly loading up new searchers too many times?

michalkleiner commented 1 year ago

@silverstripe/core-team could you please express your views on this?

The above snippet is what SCP technical documentation suggests (not what was actually merged in #278) even though they mention prior to 3.11 where #278 change was first released.

With the community moving more towards Elastic or using different modules for newer versions of Solr, and CWP contract no longer in place, I see lower risk in this change than before.

You can 👍 for yes or 👎 for now on this comment, thanks!

emteknetnz commented 1 year ago

I regression tested this fairly recently using fulltextsearch 3.11.1 - I did NOT include the snippet above in my project, as the linked SCP technical documentation it's on required if you're running less than 3.11.0

Finally, if using a version prior to 3.11.0 for FulltextSearch, you will need to update your configuration in order for the Solr results to automatically commit ... (xml snippet)

My experience was that publishing a page with edits on it did result in it being committed to the solr index, though it took roughly 10 minutes for it to actually appear in the search results.

So I don't think we need this RFC since things are working as they should be?

michalkleiner commented 1 year ago

My experience (and shared experience from other vendors) from several projects is that sometimes the searcher doesn't pick it up at all (it's in the index, but doesn't come up in results) and the snippet above helped.

So my question is more about "is the change in the snippet problematic" or "it won't do any harm when used" anyway, so if it helps some sites have results faster, why not have it.

emteknetnz commented 1 year ago

I'm all in favor of the change, it's just as far as I can tell this change has already been done and is the default i.e. we've already added the snippet

from several projects

Do you know if they were running fulltextsearch 3.11.0 or above?

michalkleiner commented 1 year ago

We ran a fork with Chillu's change and what helped in the end was the snippet here, don't know about the other sites, but I can ask this week.

michalkleiner commented 1 year ago

Just to avoid confusion — the above snippet is different to what is in 3.11.0. Here it's about starting a new searcher whereas in 3.11 we changed the auto soft-commit from never to 10 minutes.

emteknetnz commented 1 year ago

Oh right, sorry, yes I got confused - https://github.com/silverstripe/silverstripe-fulltextsearch/pull/278 added

<autoSoftCommit>
    <maxTime>${solr.autoSoftCommit.maxTime:60000}</maxTime>
</autoSoftCommit>

Which is different from the snippet above + more surprisingly what's in the SCP technical docs says which did mention that you don't need need the snippet if you're on 3.11.0 or above (the docs says change the value of <autoCommit>.<openSearcher> to true, it does not mention <autoSoftCommit>)

They both do roughly the same thing, though there's differences. Here's an indepth article on autoCommit vs autoSoftCommit

Based on the investigation Mateusz previously did - I'm going to make the assumption that there was a fair bit of consideration done at some point on the performance implications of using hard vs soft commits on CWP. I'm somewhat inclined to not change things at this point if things are working, unless we can have some stats to show that making this change won't have an overly negative impact and result in a better search experience

LABCAT commented 1 year ago

I have also been investigating an issue today related to the fact the published changes don't affect search results immediately.

I can confirm that adding: <openSearcher>true</openSearcher>

to my configuration and was actually advised by the SCP support desk to this.

Given that the SCP support desk is giving this advise I would suggest the this should be included in the configuration by default.

rob-mcgrail commented 9 months ago

Given that this seems to be required, in order for this module to work as expected (with Solr as configured on Silverstripe's own cloud platform), this should really be the default.

rob-mcgrail commented 9 months ago

On a SS 4.13 site on Silverstripe Cloud, using the latest 3.x of this module, I can confirm that <openSearcher>true</openSearcher> was required for search changes to become available. I ended up forking to confirm this, as the extraspath way of overriding the config didn't work for me (although that could be user error on my part).