commonsearch / cosr-back

Backend of Common Search. Analyses webpages and sends them to the index.
https://about.commonsearch.org
Apache License 2.0

Add different weights for parts of the page #5

Open sylvinus opened 8 years ago

sylvinus commented 8 years ago

Currently we have different weights/boosts for title, url and body text.

We should split the body text further so that, for instance, text in h1-h6 headings gets a higher weight.

The first question is how to store those different groups of text in Elasticsearch. Do we create as many fields as we have weight levels?
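
To make the "one field per weight level" option concrete, here is a rough sketch, not a proposal for the final design. The field names (`headings`, `body`, ...), the boost values, and the helper names are all made up for illustration, and the mapping assumes a recent Elasticsearch version where full-text fields use the `text` type. It only builds plain dicts, so it runs without an Elasticsearch client.

```python
from html.parser import HTMLParser

# Hypothetical field names and boost values, for illustration only.
FIELD_BOOSTS = {"title": 5.0, "url": 3.0, "headings": 2.0, "body": 1.0}

HEADING_TAGS = {"h1", "h2", "h3", "h4", "h5", "h6"}


class HeadingSplitter(HTMLParser):
    """Collects text inside h1-h6 separately from all other text."""

    def __init__(self):
        super().__init__()
        self.depth = 0  # > 0 while we are inside a heading tag
        self.headings = []
        self.body = []

    def handle_starttag(self, tag, attrs):
        if tag in HEADING_TAGS:
            self.depth += 1

    def handle_endtag(self, tag):
        if tag in HEADING_TAGS and self.depth > 0:
            self.depth -= 1

    def handle_data(self, data):
        text = data.strip()
        if text:
            (self.headings if self.depth else self.body).append(text)


def split_page(html):
    """Return one text group per weight level found in the HTML."""
    parser = HeadingSplitter()
    parser.feed(html)
    parser.close()
    return {
        "headings": " ".join(parser.headings),
        "body": " ".join(parser.body),
    }


def build_mapping():
    """One full-text field per weight group (assumes the `text` field type)."""
    return {"properties": {name: {"type": "text"} for name in FIELD_BOOSTS}}


def build_query(terms):
    """multi_match query applying a per-field boost via the `field^boost` syntax."""
    fields = [f"{name}^{boost}" for name, boost in FIELD_BOOSTS.items()]
    return {"query": {"multi_match": {"query": terms, "fields": fields}}}


doc = split_page("<h1>Hello</h1><p>world</p><h2>More</h2><p>text</p>")
query = build_query("hello")
```

With this layout the boosts live in the query rather than in the index, so weights could be tuned without reindexing. The cost is one extra field per weight level, which is the trade-off the question above is about.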