Closed: camlcase closed this issue 3 years ago
We've managed to work around this by copying MainBody into a new string property that decodes and strips the HTML. Indexing has been disabled for MainBody itself.
[Searchable]
public string MainBodySearchable => (this["MainBody"] as XhtmlString).StripHtml();
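For anyone who wants to try the same workaround: the StripHtml() helper is not shown above, so here is a minimal sketch of what such an extension could look like. The class name HtmlTextExtensions and the regex-based approach are assumptions for illustration, not the helper actually used in this thread. It works on a plain string, so an XhtmlString would first need to be converted to its HTML string representation. A regex is only a rough approximation of HTML parsing and may mishandle comments, scripts, or malformed markup.

```csharp
using System.Net;
using System.Text.RegularExpressions;

// Hypothetical helper (not part of Episerver or Epinova.Elasticsearch).
public static class HtmlTextExtensions
{
    // Replaces tags with spaces, decodes HTML entities, and collapses whitespace.
    public static string StripHtml(this string html)
    {
        if (string.IsNullOrWhiteSpace(html))
            return string.Empty;

        // Replace each tag with a space so adjacent words do not run together.
        string withoutTags = Regex.Replace(html, "<[^>]+>", " ");

        // Turn entities such as &amp; back into plain characters.
        string decoded = WebUtility.HtmlDecode(withoutTags);

        // Collapse the runs of whitespace left behind by the removed tags.
        return Regex.Replace(decoded, @"\s+", " ").Trim();
    }
}
```

With this sketch, "<p>Hello &amp; <b>world</b></p>".StripHtml() yields "Hello & world".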
As in Episerver Search & Navigation, may I suggest an attribute that decodes and strips HTML at indexing time:
[RemoveHtmlTagsWhenIndexing]
public virtual XhtmlString MainBody { get; set; }
or by convention:
Indexing.Instance
    .ForType<MyPage>()
    .StripHtml(x => x.MainBody);
The built-in Elasticsearch highlight function used by the search plugin sometimes returns broken HTML for indexed XhtmlString properties (like MainBody). I've seen discussions about this "issue", and one suggested solution is to apply the html_strip character filter during analysis.
Is there anything we can do in the plugin to solve this?
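For reference, the character-filter approach mentioned above would look roughly like this in raw Elasticsearch index settings. The index name my-index and the analyzer name html_stripped are made up for illustration, and whether the plugin lets you supply custom analysis settings like this is an assumption I have not verified. Note also that, as far as I know, character filters only affect the tokens produced during analysis and not the stored source, so highlight fragments may still contain markup even with this in place.

```json
PUT /my-index
{
  "settings": {
    "analysis": {
      "analyzer": {
        "html_stripped": {
          "type": "custom",
          "char_filter": ["html_strip"],
          "tokenizer": "standard",
          "filter": ["lowercase"]
        }
      }
    }
  },
  "mappings": {
    "properties": {
      "MainBody": {
        "type": "text",
        "analyzer": "html_stripped"
      }
    }
  }
}
```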
Episerver CMS version: 11.20.0.0
Epinova.Elasticsearch version: 11.7.3.139
Elasticsearch version: 7.9.3