silverstripe / silverstripe-fulltextsearch

Adds external full text search engine support to Silverstripe
BSD 3-Clause "New" or "Revised" License

FullTextSearch module


Adds support for fulltext search engines like Sphinx and Solr to Silverstripe CMS. Compatible with PHP 7.2.

Important notes when upgrading to fulltextsearch 3.7.0+

There are some significant changes from previous versions:

Draft content is no longer automatically added to the search index. Excluding draft content was previously opt-in, enabled by adding the following line to a search index:

$this->excludeVariantState([SearchVariantVersioned::class => Versioned::DRAFT]);

A new canView() check against an anonymous user (i.e. someone not logged in) and a ShowInSearch check are now performed by default against all records (DataObjects), both before they are added to the search index and before they are shown in search results. As a result, some records that were previously indexed and shown in search results may no longer appear.

These additional checks have been added with data security in mind, and it's assumed that records failing these checks probably should not be indexed in the first place.

Enable indexing of draft content:

You can index draft content with the following yml configuration:

SilverStripe\FullTextSearch\Search\Services\SearchableService:
  variant_state_draft_excluded: false

However, even with this set to false, draft content is only indexed when a DataObject is in a published state, not when it is in a draft-only or modified state. This is because a draft-only or modified record still fails the new anonymous user canView() check in SearchableService::isSearchable() and is automatically deleted from the index.

If you wish to also index draft content when a DataObject is in a draft-only or a modified state, then you'll need to also configure SearchableService::indexing_canview_exclude_classes. See below for instructions on how to do this.

Disabling the anonymous user canView() pre-index check

You can apply configuration to remove the new pre-index canView() check from your DataObjects if it is not necessary, or if it impedes expected functionality (e.g. for sites where users must authenticate to view any content). This also disables the check for descendants of the specified DataObjects. Before disabling the pre-index check, ensure that your implementation of fulltextsearch correctly performs a canView() check at query time; otherwise private data may leak into search results.

SilverStripe\FullTextSearch\Search\Services\SearchableService:
  indexing_canview_exclude_classes:
    - Some\Org\MyDataObject
    # This will disable the check for all pagetypes:
    - SilverStripe\CMS\Model\SiteTree
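
As a minimal sketch, the two settings described above could be combined so that draft-only and modified pages are also indexed. It uses only the configuration keys shown in this README. Listing SiteTree is an illustrative choice that disables the pre-index check for every page type, so it assumes your search results perform their own canView() check at query time:

```yaml
# Sketch only: index draft content, including draft-only and modified pages.
# Assumes a query-time canView() check is in place to avoid leaking private data.
SilverStripe\FullTextSearch\Search\Services\SearchableService:
  # Stop excluding the draft variant state from the index
  variant_state_draft_excluded: false
  # Skip the anonymous-user canView() pre-index check for all page types
  indexing_canview_exclude_classes:
    - SilverStripe\CMS\Model\SiteTree
```

Run a reindex after changing either setting so the index reflects the new rules.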

You can also use the updateIsSearchable extension point on SearchableService to modify the result of the method after the ShowInSearch and canView() checks have run.

It is highly recommended that you run a solr_reindex on your production site after upgrading from 3.6 or earlier, to purge any old data that should no longer be in the search index.

These additional checks can slow down reindexing because of the extra queries needed for permission checks. If your site also indexes the content of files, such as PDF or DOCX documents, using the text-extraction module, which is fairly time-intensive, then the relative performance impact of the canView() checks will be less noticeable.

Details on filtering before adding content to the solr index

Details on filtering when extracting results from the solr index

Requirements

Note: For Silverstripe 3.x, please use the 2.x release line.

Documentation

For pure Solr docs, check out the Solr 4.10.4 guide.

See the docs for configuration and setup, or for the quick version see the quick start guide.

For details of updates, bugfixes, and features, please see the changelog.

TODO