alexshyba / SitecoreSearchContrib

Extension to Sitecore.Search namespace. Includes AdvancedDatabaseCrawler and Searcher. Make sure to check out the website for the project.
http://sitecorian.github.io/SitecoreSearchContrib

Defining parameters on multi-database indexes #17

Closed kamsar closed 11 years ago

kamsar commented 11 years ago

I've figured out a nice trick for defining indexes that span multiple databases (e.g. master+web, as in the "demo" index included with the source). You can use the "ref" attribute to reference a single location config twice and change only the database parameter, which avoids redefining the field types, field crawlers, etc. for each database:

<configuration xmlns:x="http://www.sitecore.net/xmlconfig/">
    <sitecore>
        <search>
            <configuration>
                <indexes>
                    <index id="general" type="Sitecore.Search.Index, Sitecore.Kernel">
                        <param desc="name">$(id)</param>
                        <param desc="folder">$(id)</param>
                        <Analyzer ref="search/analyzer"/>
                        <locations hint="list:AddCrawler">
                            <master ref="search/templates/general-index-template" param1="master" /> <!-- this refers to the node below and passes it "master" as $(1) -->
                            <web ref="search/templates/general-index-template" param1="web" />
                        </locations>
                    </index>
                </indexes>
            </configuration>
            <templates>
                <general-index-template type="scSearchContrib.Crawler.Crawlers.AdvancedDatabaseCrawler,scSearchContrib.Crawler">
                    <Database>$(1)</Database>
                    <Root>/sitecore/content</Root>

                    <fieldTypes hint="raw:AddFieldTypes">
                        <!-- Text fields need to be tokenized -->
                        <fieldType name="single-line text" storageType="YES" indexType="TOKENIZED" vectorType="NO" boost="1f" />
                        <fieldType name="multi-line text" storageType="YES" indexType="TOKENIZED" vectorType="NO" boost="1f" />
                        <fieldType name="word document" storageType="NO" indexType="TOKENIZED" vectorType="NO" boost="1f" />
                        <fieldType name="html" storageType="NO" indexType="TOKENIZED" vectorType="NO" boost="1f" />
                        <fieldType name="rich text" storageType="NO" indexType="TOKENIZED" vectorType="NO" boost="1f" />
                        <fieldType name="memo" storageType="NO" indexType="TOKENIZED" vectorType="NO" boost="1f" />
                        <fieldType name="text" storageType="YES" indexType="TOKENIZED" vectorType="NO" boost="1f" />
                        <fieldType name="checkbox" storageType="YES" indexType="TOKENIZED" vectorType="NO" boost="1f" />
                        <!-- Multilist based fields need to be tokenized to support search of multiple values -->
                        <fieldType name="multilist" storageType="YES" indexType="TOKENIZED" vectorType="NO" boost="1f" />
                        <fieldType name="treelist" storageType="YES" indexType="TOKENIZED" vectorType="NO" boost="1f" />
                        <fieldType name="queryable treelist" storageType="YES" indexType="TOKENIZED" vectorType="NO" boost="1f" />
                        <fieldType name="treelistex" storageType="YES" indexType="TOKENIZED" vectorType="NO" boost="1f" />
                        <fieldType name="checklist" storageType="YES" indexType="TOKENIZED" vectorType="NO" boost="1f" />
                        <!-- Legacy tree list field from ver. 5.3 -->
                        <fieldType name="tree list" storageType="YES" indexType="TOKENIZED" vectorType="NO" boost="1f" />
                    </fieldTypes>

                    <fieldCrawlers hint="raw:AddFieldCrawlers">
                        <fieldCrawler type="scSearchContrib.Crawlers.FieldCrawlers.LookupFieldCrawler,scSearchContrib.Crawler" fieldType="Droplink" />
                        <fieldCrawler type="scSearchContrib.Crawlers.FieldCrawlers.LookupFieldCrawler,scSearchContrib.Crawler" fieldType="Droptree" />
                        <fieldCrawler type="scSearchContrib.Crawlers.FieldCrawlers.DateFieldCrawler,scSearchContrib.Crawler" fieldType="Datetime" />
                        <fieldCrawler type="scSearchContrib.Crawlers.FieldCrawlers.DateFieldCrawler,scSearchContrib.Crawler" fieldType="Date" />
                        <fieldCrawler type="scSearchContrib.Crawlers.FieldCrawlers.NumberFieldCrawler,scSearchContrib.Crawler" fieldType="Number" />
                    </fieldCrawlers>

                    <dynamicFields hint="raw:AddDynamicFields">
                    </dynamicFields>
                </general-index-template>
            </templates>
        </search>
    </sitecore>
</configuration>

The "ref" can point anywhere in the config document; I just used a templates node because it seemed logical.
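Because the crawler definition lives in one place, the same "ref" should also be reusable from another index. Here is a minimal sketch, assuming the general-index-template node above is still defined under search/templates; the "web-only" index id is made up for illustration and is not part of the original config:

```xml
<configuration xmlns:x="http://www.sitecore.net/xmlconfig/">
    <sitecore>
        <search>
            <configuration>
                <indexes>
                    <!-- Hypothetical second index; "web-only" is an illustrative id, not from the original config -->
                    <index id="web-only" type="Sitecore.Search.Index, Sitecore.Kernel">
                        <param desc="name">$(id)</param>
                        <param desc="folder">$(id)</param>
                        <Analyzer ref="search/analyzer"/>
                        <locations hint="list:AddCrawler">
                            <!-- Reuses the shared crawler definition and passes "web" as $(1) -->
                            <web ref="search/templates/general-index-template" param1="web" />
                        </locations>
                    </index>
                </indexes>
            </configuration>
        </search>
    </sitecore>
</configuration>
```

Only the locations entry changes per index; the field types and field crawlers still come from the single template node.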

This isn't really a bug in the code, but it might help some folks. It could be used to refactor the demo config file's "demo" index to only need to define stuff once.

techphoria414 commented 11 years ago

Awesome, perhaps Alex can add this to the wiki for scSearchContrib.

alexshyba commented 11 years ago

Great find indeed! Thanks, @kamsar. I've modified the scSearchContrib.Crawler.config file with your suggestion. Looks much cleaner now. https://github.com/sitecorian/SitecoreSearchContrib/blob/master/scSearchContrib.Crawler/App_Config/Include/scSearchContrib.Crawler.config

motoyugota commented 11 years ago

Does anyone know if it is possible to do something similar to this (using "ref=...") for sections within the crawler config, such as the contents of ""? We have indexes that vary in their dynamic fields and included templates, but the base field types are all configured 100% the same, and it seems like a huge waste to have to copy and maintain the fieldTypes and fieldCrawlers sections multiple times.