Better explain the difference between record_css_selector and attributesToIndex settings

nhoizey commented 7 years ago

I still don't understand how attributesToIndex can contain headings when the default record_css_selector only targets <p>s…

pixelastic commented 7 years ago

You're right that when presented that way it sounds weird.

The way it works is that we read every node matching record_css_selector in the page, and for each of them we build the hierarchy of parent titles (h1, h2, etc.). The plugin also gets other information from the page, like its URL and anything present in the front matter. It packs all that into a JSON object and sends it to Algolia.
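To make that more concrete, here is a rough Python sketch of the idea (the plugin itself is not written this way, and this is not its actual code). The function name `build_records`, the flat `(tag, text)` input, and the `text` key are assumptions made up for the illustration; only the general shape (one record per matched node, carrying its parent headings plus page-level data) comes from the description above.

```python
import json

HEADINGS = ("h1", "h2", "h3", "h4", "h5", "h6")


def build_records(nodes, record_tags=("p",), page_meta=None):
    """Build one record per matching node (hypothetical helper).

    nodes: list of (tag, text) tuples in document order, a simplified
    stand-in for the parsed HTML page.
    record_tags: tags that become records, mimicking a record_css_selector
    that only targets <p>.
    page_meta: page-level data such as the URL and front matter values.
    """
    page_meta = page_meta or {}
    hierarchy = {}  # current heading text for each level seen so far
    records = []
    for tag, text in nodes:
        if tag in HEADINGS:
            level = HEADINGS.index(tag)
            hierarchy[tag] = text
            # A new heading invalidates every deeper heading.
            for deeper in HEADINGS[level + 1:]:
                hierarchy.pop(deeper, None)
        if tag in record_tags:
            record = dict(page_meta)   # url, front matter, ...
            record.update(hierarchy)   # h1, h2, ... of the parent titles
            record["text"] = text      # the matched node's own content
            records.append(record)
    return records


nodes = [
    ("h1", "Getting started"),
    ("p", "Install the gem."),
    ("h2", "Configuration"),
    ("p", "Edit _config.yml."),
]
records = build_records(
    nodes, page_meta={"url": "/docs/start.html", "title": "Docs"}
)
print(json.dumps(records[1], indent=2))
```

So even though only the `<p>` nodes become records, each record still carries `h1`, `h2`, and so on as separate keys, which is why those keys can be searched.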

The attributesToIndex setting lists all the keys of the JSON that should be searchable, in order of importance. By default, we search in the actual content and in the whole hX hierarchy. We don't search in URLs or HTML versions, for example. Being able to override attributesToIndex lets people who have specific fields in their front matter search through them.
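As an illustration only, the following Python sketch mimics the effect of that list with a naive local lookup. It is not how Algolia ranks results; `first_matching_attribute`, the sample record, the `author` field, and the exact default list are all assumptions for the example (the real defaults may differ).

```python
def first_matching_attribute(record, query, searchable_attributes):
    """Return the first searchable key whose value contains the query.

    Keys are checked in the order given, so earlier keys "win".
    Keys missing from searchable_attributes are never looked at.
    """
    needle = query.lower()
    for attr in searchable_attributes:
        value = record.get(attr)
        if isinstance(value, str) and needle in value.lower():
            return attr
    return None


# Hypothetical record, shaped like the JSON object described above.
record = {
    "h1": "Getting started",
    "h2": "Configuration",
    "text": "Edit _config.yml.",
    "url": "/docs/configuration.html",
    "author": "Jane",  # imagined custom front matter field
}

# Assumed default: heading hierarchy plus content, but not the URL.
default_attrs = ["h1", "h2", "h3", "h4", "h5", "h6", "text"]

# Overriding the list to make a front matter field searchable too.
custom_attrs = ["author"] + default_attrs

print(first_matching_attribute(record, "configuration", default_attrs))
print(first_matching_attribute(record, "docs", default_attrs))
print(first_matching_attribute(record, "jane", custom_attrs))
```

The first lookup matches on `h2`, the second finds nothing because "docs" only appears in `url` (which is not searchable here), and the third matches only because `author` was added to the list.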

As for naming, I agree that record_css_selector might not be the most self-explanatory name I could have come up with... We also renamed attributesToIndex to searchableAttributes to make it more explicit (both still work, but we only use searchableAttributes in new versions and documentation).

Does that make more sense? I'll add a note about that in the README if you think it explains things better. If not, or if you think I should add something (like examples and use cases), let me know.

pixelastic commented 6 years ago

The new README of jekyll-algolia explains a bit more about how we extract the headings. I've also renamed record_css_selector to nodes_to_index in the new version; I hope it makes things clearer.

algolia / algoliasearch-jekyll #56