fedwiki / wiki-server

Federated Wiki client and server in Node.js

Improving text extraction for search #183

Closed by paul90 4 months ago

paul90 commented 4 months ago

This update adds an extra filter to ensure all markup tags are removed before the content is indexed.

In a test against a copy of one of the larger wikis, the index export shrank to about 20% of its previous size.
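
As a rough illustration of the kind of filter described above, here is a minimal sketch that strips anything resembling a markup tag from a string before it is passed to an indexer. The function name `stripMarkup` and the regular expressions are assumptions made for this example. They are not taken from the wiki-server source, and the actual change may handle more cases (entities, scripts, malformed markup).

```javascript
// Hypothetical helper for illustration only; not the wiki-server implementation.
// Removes tag-like sequences so only visible text reaches the search index.
function stripMarkup(text) {
  return text
    .replace(/<[^>]*>/g, ' ') // replace anything that looks like a tag with a space
    .replace(/\s+/g, ' ')     // collapse the runs of whitespace left behind
    .trim();
}

const sample = '<p>Hello <b>federated</b> wiki</p>';
console.log(stripMarkup(sample)); // prints "Hello federated wiki"
```

Replacing tags with a space rather than an empty string keeps words on either side of a tag from being joined into a single token. A regex-based approach like this is only a sketch and is not a substitute for a real HTML parser when the input is untrusted or irregular.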