sirensolutions / siren-join

[This is the old, single node version for Elasticsearch 2.x, see the latest "Siren Federate" plugin for distributed Elasticsearch 5.x and 6.x capabilities]
http://siren.io
GNU Affero General Public License v3.0

Option to force the encoding of terms as integer instead of long #25

Closed by rendel 8 years ago

rendel commented 8 years ago

We can automatically detect when terms can be encoded as integers instead of longs by checking the max value returned by the field stats. In addition, when integer encoding applies, we could use a vint encoding to compress the set of terms: integer values are less likely to follow a random distribution than long values, which are typically generated hashes.
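A minimal sketch of the two ideas above, not the plugin's actual implementation. It assumes the field's min and max values have already been obtained from the field stats, and every function name here (`fits_in_int32`, `encode_vint`, `encode_terms`, `decode_terms`) is hypothetical. The vint scheme shown is the common 7-bits-per-byte layout with the high bit as a continuation flag, and it only handles non-negative values.

```python
INT32_MIN = -2**31
INT32_MAX = 2**31 - 1


def fits_in_int32(min_value, max_value):
    """Return True if every term in [min_value, max_value] fits in a signed 32-bit integer."""
    return INT32_MIN <= min_value and max_value <= INT32_MAX


def encode_vint(value):
    """Encode a non-negative integer using 7 payload bits per byte.

    The low-order group comes first; the high bit of each byte is set
    when more bytes follow.
    """
    if value < 0:
        raise ValueError("this sketch only encodes non-negative values")
    out = bytearray()
    while value >= 0x80:
        out.append((value & 0x7F) | 0x80)
        value >>= 7
    out.append(value)
    return bytes(out)


def encode_terms(terms):
    """Concatenate the vint encoding of each term."""
    out = bytearray()
    for term in terms:
        out += encode_vint(term)
    return bytes(out)


def decode_terms(data):
    """Inverse of encode_terms: read vints until the buffer is exhausted."""
    terms = []
    value = 0
    shift = 0
    for byte in data:
        value |= (byte & 0x7F) << shift
        if byte & 0x80:
            shift += 7  # continuation bit set: more bytes belong to this value
        else:
            terms.append(value)
            value = 0
            shift = 0
    return terms
```

With this layout a value below 128 takes one byte and a value below 16384 takes two, so the gain depends on the terms being small. Uniformly distributed hashes would see little or no benefit.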

rendel commented 8 years ago

Reverting to the original idea: it is generally better to leave control to the user. Also, Kibi can easily detect when a field can be encoded with integers and recommend it to the user.

scampi commented 8 years ago

Should I open an issue in Kibi to support this feature, so that it can give a warning?

rendel commented 8 years ago

Yes, I think we need an umbrella issue for the integration of the siren-join plugin with Kibi. At the moment Kibi is missing:

scampi commented 8 years ago

Opened in https://github.com/sirensolutions/kibi-private/issues/115