Optimizing the document format for internal field management.

Thank you for your work and this useful library.

After reading through the docs, FAQ and API, I have an open question regarding how documents are indexed.

When looking at the runtime behavior of the lib as a black box, I asked myself: how do query results differ between arrays and objects when they are indexed?

After reading the code, my conclusions are:

For the indexed document, object properties turn into fields.
Fields are a kind of category, each holding n indexed entries that are search result candidates.
However, field names are not searched through the same query pipeline / API.
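
To make that mental model concrete, here is a minimal sketch of how I picture properties turning into fields. It does not use search-index at all: `fieldPaths` is a hypothetical helper of my own, and both the dotted path naming and the treatment of arrays are assumptions on my side, not confirmed library behavior.

```javascript
// Hypothetical helper, not part of search-index: lists the dotted property
// paths that lead to leaf values in a document.
function fieldPaths(value, prefix = '') {
  if (value === null || typeof value !== 'object') {
    return [prefix];
  }
  if (Array.isArray(value)) {
    // Assumption: array items share their parent's field name.
    return [...new Set(value.flatMap((item) => fieldPaths(item, prefix)))];
  }
  return Object.entries(value).flatMap(([key, child]) =>
    fieldPaths(child, prefix ? `${prefix}.${key}` : key)
  );
}

const doc = {
  title: 'Sensors',
  tags: ['indoor', 'outdoor'],
  parts: { 'thermo-x1': { unit: 'C' }, 'hygro-z9': { unit: '%' } },
};

console.log(fieldPaths(doc));
// → [ 'title', 'tags', 'parts.thermo-x1.unit', 'parts.hygro-z9.unit' ]
```

Under these assumptions an array contributes one field, while every distinct object key contributes a field of its own.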

As of now, my documents follow a JSON Schema that defines, at several levels, objects with arbitrary property names. These arbitrary names are unique across hundreds of documents and are relevant search terms themselves. That leads me to assume that:

Without converting my documents into another format, the resulting search index would not be very useful: thousands of fields would be created, each holding only one result candidate, and the field names would not be processed as search terms.
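
The kind of conversion I have in mind could look roughly like the sketch below. `keysToEntries` and the `name` field are hypothetical names chosen for illustration. How search-index treats the resulting array of objects is exactly the part I am unsure about, so this only shows the reshaping step.

```javascript
// Hypothetical helper: turns an object with arbitrary keys into an array of
// entries, moving each key into a fixed `name` field so it becomes a value.
function keysToEntries(obj, nameField = 'name') {
  return Object.entries(obj).map(([key, value]) => {
    const isPlainObject =
      value !== null && typeof value === 'object' && !Array.isArray(value);
    // Object values are merged into the entry; anything else goes under `value`.
    return isPlainObject
      ? { [nameField]: key, ...value }
      : { [nameField]: key, value };
  });
}

const original = {
  id: 'doc-1',
  parts: { 'thermo-x1': { unit: 'C' }, 'hygro-z9': { unit: '%' } },
};

// Only the `parts` level has arbitrary keys in this example document.
const converted = { ...original, parts: keysToEntries(original.parts) };

console.log(JSON.stringify(converted));
// → {"id":"doc-1","parts":[{"name":"thermo-x1","unit":"C"},{"name":"hygro-z9","unit":"%"}]}
```

With this shape the set of property names stays fixed (`parts`, `name`, `unit`), and the formerly arbitrary keys are ordinary values.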

It would be great if you could correct me if any of those assumptions are false and I have overlooked or misunderstood something about the API.

fergiemcdowall / search-index, issue #576