Closed YoannMR closed 5 years ago
Stemmed variants of the default catch-all field are now stored for the thesaurus recommender, so they can also be used to highlight non-stemmed fields such as content_txt. Separate stemmed content fields are therefore no longer needed to highlight stemmed variants in the search UI. The newest releases use only a second, language-specific default/catch-all field (for example text_txt_en) for stemming, and the ETL no longer copies every other field to a stemmed variant.
Hi,
I recently moved to the latest version of OSS and noticed that the Solr index is significantly bigger (at least 2x larger).
I compared the fields in the two versions: while there are many more fields in the new version, I noticed that the content of the document is duplicated 4 times (in "text", "text_txt_en", "content_txt" and "content_txt_txt_en").
Is there a need for this duplication? The previous version only had 'content' with the document text.
Thanks for your help!
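For anyone who wants to check the same thing on their own index, here is a minimal sketch of how the duplication described above could be spotted in a single document. It assumes the document has already been fetched from Solr as a Python dict (for example from a JSON query response). The helper name `find_duplicated_fields` and the sample values are hypothetical; only the field names come from this report. It compares stored values only, so fields that are indexed but not stored will not show up.

```python
from collections import defaultdict


def find_duplicated_fields(doc):
    """Return groups of field names whose stored values are identical.

    Hypothetical helper for illustration; not part of Open Semantic Search or Solr.
    """
    groups = defaultdict(list)
    for field, value in doc.items():
        # Multi-valued fields come back as lists; normalise everything
        # to a tuple so single- and multi-valued copies compare equal.
        key = tuple(value) if isinstance(value, list) else (value,)
        groups[key].append(field)
    # Keep only values that appear in more than one field.
    return [sorted(fields) for fields in groups.values() if len(fields) > 1]


# Made-up sample document using the field names from this issue.
doc = {
    "id": "file:///example.pdf",
    "text": ["lorem ipsum"],
    "text_txt_en": ["lorem ipsum"],
    "content_txt": "lorem ipsum",
    "content_txt_txt_en": "lorem ipsum",
}

for group in find_duplicated_fields(doc):
    print("same content in:", ", ".join(group))
```

In a real index the catch-all fields usually hold more than the body text, so an exact-match comparison like this may under-report the overlap.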