Ensure Text content paragraphs remain separate after search indexing

I have seen a situation (with AGSA/Tarnanthi) where some text entered on a page as a Text content item (HTML behind the scenes) becomes unsearchable because separate paragraphs of text are concatenated in the text document created during search indexing.

For example, a Text content item with the HTML content `<p>This is a tricky</p><p>test</p>` can get converted to `This is a trickytest`, with no whitespace between "tricky" and "test", by the default ICEkit search document template `icekit/templates/search/indexes/icekit/default.txt`. Subsequent searches for the words "tricky" and "test" may then fail to find the page containing this Text content, depending on the word-stemming rules used on a site.

I think this is caused by the striptags filter used in that template, combined with HTML content generated by the Text widget without any newlines between HTML markup tags.
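The suspected cause can be reproduced outside ICEkit with a small sketch. `strip_tags_sketch` below is a hypothetical stand-in for a tag-stripping filter, built on the standard library's `HTMLParser`; it is not the template filter itself, and the sample markup is an assumed example of two adjacent paragraphs with no whitespace between them.

```python
from html.parser import HTMLParser


class TagStripper(HTMLParser):
    """Collects text nodes only and drops all markup (illustrative stand-in)."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.chunks = []

    def handle_data(self, data):
        # Text between tags is kept verbatim; nothing is inserted
        # where a tag used to be.
        self.chunks.append(data)


def strip_tags_sketch(html):
    """Return the text content of `html` with every tag removed."""
    stripper = TagStripper()
    stripper.feed(html)
    stripper.close()
    return "".join(stripper.chunks)


# Assumed sample: two paragraphs with no newline between the tags.
html = "<p>This is a tricky</p><p>test</p>"
indexed_text = strip_tags_sketch(html)

print(indexed_text)          # the two paragraphs run together
print(indexed_text.split())  # "tricky" and "test" are no longer separate tokens
```

Because the tags are the only thing separating the two paragraphs, removing them leaves a single token where the indexer should see two words.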

It can probably be best fixed by ensuring that the `</p>` paragraph end tags generated by the Text component include a trailing newline character.
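A minimal sketch of that idea follows, assuming the HTML can be post-processed before it is saved or indexed. The helper names (`add_newline_after_paragraphs`, `strip_tags_sketch`) are hypothetical and not part of ICEkit, and where such a hook would best live in the Text component is left open.

```python
import re
from html.parser import HTMLParser

# Matches a closing paragraph tag that is not already followed by a newline.
_P_END = re.compile(r"</p\s*>(?!\n)", re.IGNORECASE)


def add_newline_after_paragraphs(html):
    """Append a newline after each </p> that lacks one (hypothetical helper)."""
    return _P_END.sub(lambda match: match.group(0) + "\n", html)


class TagStripper(HTMLParser):
    """Collects text nodes only; stands in for a tag-stripping filter."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.chunks = []

    def handle_data(self, data):
        self.chunks.append(data)


def strip_tags_sketch(html):
    """Return the text content of `html` with every tag removed."""
    stripper = TagStripper()
    stripper.feed(html)
    stripper.close()
    return "".join(stripper.chunks)


# Assumed sample markup with no whitespace between paragraphs.
html = "<p>This is a tricky</p><p>test</p>"
fixed_html = add_newline_after_paragraphs(html)
words = strip_tags_sketch(fixed_html).split()
print(words)  # "tricky" and "test" are now separate tokens
```

The negative lookahead keeps the helper idempotent, so running it again over content that already has the newlines leaves it unchanged.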

ic-labs / django-icekit #294