Open umitgunduz opened 5 years ago
Hello @umitgunduz ,
Dynamic fields are a feature of Solr, which is an HTTP server on top of Lucene. Tantivy is more like Lucene than it is like Solr or Elasticsearch.
That said, I have built my own type of dynamic field code that then does the right things with tantivy, but I don't think Tantivy is the right place for that code.
Thanks @drusellers.
@umitgunduz reopening. @umitgunduz can you detail what you mean by dynamic field?
Hello @fulmicoton, in fact I found answers to similar problems while browsing the issues: https://github.com/tantivy-search/tantivy/issues/301 https://github.com/tantivy-search/tantivy/issues/385
The main problem is that all documents must be re-indexed when a new field is added to the schema. This is very costly for large amounts of data and is not realistic.
What we need and expect is a structure that has a flexible schema, stores hierarchical data like a document-based NoSQL database, and offers search engine features: full-text search, facets, pivots, etc. The nested documents approach is not widely used or liked, and it is very difficult to work with. There is no fixed data structure in today's world, and search engines are no longer used only for text search. I know it's not easy to create such a structure, and maybe it's wrong to have it in a search engine library. However, most developers would love a structure like this.
Thanks.
> Dynamic fields are a feature of Solr, which is an HTTP server on top of Lucene. Tantivy is more like Lucene than it is like Solr or Elasticsearch.
> That said, I have built my own type of dynamic field code that then does the right things with tantivy, but I don't think Tantivy is the right place for that code.
I tend to disagree. I have had a look at Toshi and Bayard, and both copy the rigid schema definitions of tantivy. Lucene, on the other hand, does not enforce a rigid schema; you can add fields on the fly when adding documents, such as:
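import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field;
import org.apache.lucene.document.StringField;
import org.apache.lucene.document.TextField;
// Fields are chosen per document; no schema is declared up front.
Document document = new Document();
document.add(new StringField("id", strId, Field.Store.YES));
document.add(new TextField("firstName", firstName, Field.Store.YES));
document.add(new TextField("lastName", lastName, Field.Store.YES));
document.add(new TextField("website", website, Field.Store.YES));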
This makes the creation of dynamic fields very easy because you "just" have to nudge your application to handle fields that end with _t to be text indexes (or whatever you defined). It would be tremendously helpful to have such fields. I frequently work with dynamic schemas that are not known in advance (e.g. user-defined fields), and thus this is a killer feature of Lucene (and Solr).
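To make the suffix convention concrete, here is a minimal sketch in plain Java of how an application layer might map field names to index types. The class `DynamicFieldRules`, its `FieldType` enum, and the `_t` / `_s` / `_l` suffixes are hypothetical names chosen for illustration; they are not part of Lucene, Solr, or tantivy.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Optional;

// Hypothetical helper: resolves a field's index type from its name suffix,
// in the spirit of Solr-style "*_t" dynamic field patterns.
final class DynamicFieldRules {
    enum FieldType { TEXT, STRING, LONG }

    // Suffix -> type, checked in insertion order. The suffixes are assumptions.
    private final Map<String, FieldType> suffixRules = new LinkedHashMap<>();

    DynamicFieldRules rule(String suffix, FieldType type) {
        suffixRules.put(suffix, type);
        return this;
    }

    // Returns the type of the first rule whose suffix matches, if any.
    Optional<FieldType> resolve(String fieldName) {
        for (Map.Entry<String, FieldType> e : suffixRules.entrySet()) {
            if (fieldName.endsWith(e.getKey())) {
                return Optional.of(e.getValue());
            }
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        DynamicFieldRules rules = new DynamicFieldRules()
            .rule("_t", FieldType.TEXT)
            .rule("_s", FieldType.STRING)
            .rule("_l", FieldType.LONG);
        System.out.println(rules.resolve("nickname_t")); // Optional[TEXT]
        System.out.println(rules.resolve("unknown"));    // Optional.empty
    }
}
```

With Lucene, the resolved type would then decide whether the application creates a TextField or a StringField for that value; a field name that matches no rule can be rejected or skipped.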
I do not know the inner workings of tantivy, but Lucene does not require a reindex of all the data once a new field is added, because indexes are sparse by definition. So, yes, I would definitely be happy if tantivy would support dynamic index fields that are defined in the schema (e.g. by giving them names like *_t, just like Solr).
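As a rough illustration of why a sparse index does not need a reindex when a new field appears, here is a toy inverted index in plain Java. It is only a sketch of the idea and does not reflect how Lucene or tantivy actually store postings; the class `SparseIndex` and the field names are hypothetical.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Toy inverted index (illustration only, not Lucene's or tantivy's layout).
final class SparseIndex {
    // field -> term -> doc ids. A field gets an entry only once a document uses it.
    private final Map<String, Map<String, TreeSet<Integer>>> postings = new HashMap<>();
    private int nextDocId = 0;

    // Indexes one document given as field -> text and returns its doc id.
    int add(Map<String, String> doc) {
        int docId = nextDocId++;
        for (Map.Entry<String, String> f : doc.entrySet()) {
            Map<String, TreeSet<Integer>> terms =
                postings.computeIfAbsent(f.getKey(), k -> new HashMap<>());
            for (String term : f.getValue().toLowerCase().split("\\s+")) {
                if (term.isEmpty()) continue; // skip tokens from leading whitespace
                terms.computeIfAbsent(term, k -> new TreeSet<>()).add(docId);
            }
        }
        return docId;
    }

    // Looks up the doc ids containing the term in the given field.
    Set<Integer> search(String field, String term) {
        return postings
            .getOrDefault(field, Collections.emptyMap())
            .getOrDefault(term.toLowerCase(), new TreeSet<>());
    }

    public static void main(String[] args) {
        SparseIndex index = new SparseIndex();
        index.add(Collections.singletonMap("firstName", "Ada"));

        // A later document introduces a field the index has never seen.
        Map<String, String> doc = new HashMap<>();
        doc.put("firstName", "Alan");
        doc.put("website_t", "example org");
        index.add(doc);

        System.out.println(index.search("website_t", "example")); // [1]
        System.out.println(index.search("firstName", "ada"));     // [0]
    }
}
```

The first document's postings are never touched when `website_t` shows up: the new field simply gets its own term map, and older documents are absent from it.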
Hi, do you plan to add dynamic fields like Solr's? We would also like dynamic fields to support facet functionality. Thanks.