quickwit-oss / quickwit

Cloud-native search engine for observability. An open-source alternative to Datadog, Elasticsearch, Loki, and Tempo.
https://quickwit.io

Please document schema evolution, and limitations #3174

Open tv42 opened 1 year ago

tv42 commented 1 year ago

As far as I know, the underlying Tantivy still does not allow changing the schema: https://github.com/tantivy-search/tantivy/issues/470 -- and this makes me wonder about Quickwit.

I read https://quickwit.io/docs/configuration/node-config, but nothing there tells me whether I can edit these settings freely, or when a change takes effect. For example: how can I tell when a field_mapping change from i64 to u64 has taken effect? Will old documents be reindexed if I change the tokenizer? And so on.

fulmicoton commented 1 year ago

It is not possible for the moment to change your schema. We are working on this.

For the moment, the recommended way to stay "flexible" is to declare only a handful of fields that are unlikely to change in the future (a typical one is the timestamp), and to rely heavily on dynamic/schemaless mode for the rest of the schema.
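As a rough illustration of that layout, here is a minimal sketch that builds such a doc mapping as a Python dict and prints it as JSON. The index id `logs-example` is hypothetical, and the key names (`mode`, `field_mappings`, `timestamp_field`, `input_formats`, `fast`) are assumptions modelled on the index configuration docs, so check them against the Quickwit version you run.

```python
import json

# Hypothetical index config: one stable, explicitly typed field (the
# timestamp) and dynamic mode for everything else. Key names are
# assumptions based on the Quickwit index config docs.
index_config = {
    "index_id": "logs-example",  # hypothetical name
    "doc_mapping": {
        # Fields not listed below are handled by dynamic/schemaless mode.
        "mode": "dynamic",
        "field_mappings": [
            {
                "name": "timestamp",
                "type": "datetime",
                "input_formats": ["unix_timestamp"],
                "fast": True,
            }
        ],
        "timestamp_field": "timestamp",
    },
}

# Names of the fields that are fixed in the schema.
declared_fields = [f["name"] for f in index_config["doc_mapping"]["field_mappings"]]

print(json.dumps(index_config, indent=2))
```

The point of the design is that only `timestamp` is pinned down, so new or changing fields in incoming documents do not require a schema change.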

In Quickwit 0.6 (expected at the end of this month), schemaless fields will have very few limitations: it will be possible to have them as fast fields, or multivalued, for instance.

Letting you change the schema or an indexer config is likely to happen in Quickwit before tantivy. An optimistic estimate would be Quickwit 0.7, in June. It is easier in Quickwit because every split embeds its own schema. We just need to make sure the merging logic is never applied across splits with different schemas, and make the search logic a little more lenient than it is today.
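To make the merge constraint concrete, here is a minimal sketch of the idea, not Quickwit's actual merge code: splits are grouped by a fingerprint of their embedded schema, and only splits within the same group are candidates for merging. The split records, the `schema_fingerprint` helper, and the `group_mergeable` function are all hypothetical names introduced for illustration.

```python
import hashlib
import json
from collections import defaultdict
from typing import Dict, List


def schema_fingerprint(schema: Dict) -> str:
    """Hash a schema so that identical schemas get identical fingerprints."""
    # sort_keys makes the serialization independent of dict key order.
    canonical = json.dumps(schema, sort_keys=True)
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def group_mergeable(splits: List[Dict]) -> List[List[str]]:
    """Group split ids so that each group shares exactly one schema."""
    groups = defaultdict(list)
    for split in splits:
        groups[schema_fingerprint(split["schema"])].append(split["split_id"])
    # Groups come back in order of first appearance of each schema.
    return list(groups.values())


# Hypothetical splits: "a" and "c" share a schema, "b" was written after
# the field type changed from i64 to u64.
splits = [
    {"split_id": "a", "schema": {"count": "i64"}},
    {"split_id": "b", "schema": {"count": "u64"}},
    {"split_id": "c", "schema": {"count": "i64"}},
]

merge_groups = group_mergeable(splits)
print(merge_groups)
```

Under this scheme a merge policy would only ever pick candidates from a single group, so old and new schemas never end up in the same merged split.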