elastic / elasticsearch

Free and Open Source, Distributed, RESTful Search Engine
https://www.elastic.co/products/elasticsearch

Add a query/field compatibility matrix #71730

Open markharwood opened 3 years ago

markharwood commented 3 years ago

We have a growing catalogue of query types and field types, but not all queries are supported on all fields. Neither the field docs nor the query docs want to deal with the messy business of documenting compatibility - the documentation for queries does not list the field types that support each query type, and the documentation for fields does not list the supported queries.

A simple support matrix with a yes/no check in each cell would help fill in a lot of the detail for users. Ideally this document would be automatically tested, or even generated, by checking against the software itself - documentation can get out of sync quickly.
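To make the idea concrete, here is a minimal Python sketch of rendering such a matrix from a dictionary of results. The function name `render_matrix` and the cells in `EXAMPLE_RESULTS` are placeholders for illustration only, not measured data; a real version would fill the dictionary by probing a running cluster.

```python
def render_matrix(results, fields, queries):
    """Render a plain-text yes/no matrix; cells with no data are shown as '?'."""
    rows = [["field"] + list(queries)]
    for field in fields:
        row = [field]
        for query in queries:
            supported = results.get((field, query))
            if supported is None:
                row.append("?")
            else:
                row.append("yes" if supported else "no")
        rows.append(row)
    # Pad every cell to the widest one so the columns line up.
    width = max(len(cell) for row in rows for cell in row)
    return "\n".join(
        "  ".join(cell.ljust(width) for cell in row).rstrip() for row in rows
    )


# Placeholder cells for illustration only (assumed, not generated from a cluster).
EXAMPLE_RESULTS = {
    ("keyword", "term"): True,
    ("keyword", "wildcard"): True,
    ("flattened", "term"): True,
    ("flattened", "wildcard"): False,
}

print(render_matrix(EXAMPLE_RESULTS, ["keyword", "flattened"], ["term", "wildcard"]))
```

Keying the dictionary on `(field, query)` pairs keeps missing combinations visible as `?` instead of silently reading as "no".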

Beyond a simple yes/no assertion it may prove necessary for the support matrix to document some of the nuances of certain query settings or behaviours when used on certain fields e.g

elasticmachine commented 3 years ago

Pinging @elastic/es-docs (Team:Docs)

markharwood commented 3 years ago

I have done some work to automate this and have produced the raw data for:

1) Several Elasticsearch versions
2) All term-level query types
3) Many field types

A Python script runs all the required combinations and outputs reports on the matching behaviour.
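The overall shape of such a script might look like the sketch below. This is not the actual script: `run_matrix`, `fake_probe` and the version labels are hypothetical names, and `fake_probe` stands in for the real round trip (create a mapping, index a document, run the query, check the hits).

```python
from itertools import product


def run_matrix(versions, query_types, field_types, probe):
    """Call probe(version, query, field) for every combination and collect outcomes."""
    report = []
    for version, query, field in product(versions, query_types, field_types):
        try:
            outcome = "match" if probe(version, query, field) else "no_match"
        except Exception as error:  # e.g. the query was rejected for this field type
            outcome = f"error: {error}"
        report.append(
            {"version": version, "query": query, "field": field, "outcome": outcome}
        )
    return report


def fake_probe(version, query, field):
    """Hypothetical stand-in for a real request against a cluster of that version."""
    if field == "flattened" and query == "wildcard":
        raise ValueError("unsupported")
    return True


report = run_matrix(
    ["version-a", "version-b"], ["term", "wildcard"], ["keyword", "flattened"], fake_probe
)
for row in report:
    print(row)
```

Recording errors as outcomes rather than aborting means one unsupported combination does not hide the results for the rest.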

The software and raw results are here. Check out the readme.txt file for an overview. The results indicate whether each combination of the above is supported or functioning correctly (some earlier versions had bugs and we still have known discrepancies).

I don't know what we want to do with the data - there are ~1000 cells in this matrix for just the 3 Elasticsearch versions I've tested so far, before we consider adding all the "point" releases. It's far too much for a single HTML table, but storing and analysing it in Elasticsearch/Kibana seems like an obvious way to allow slicing and dicing.
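As a sketch of the storage side, each cell could become one document in a bulk request body. The `_bulk` endpoint takes newline-delimited JSON with an action line before each document; the index name `query-field-matrix` and the document fields below are assumptions made up for this example.

```python
import json


def to_bulk_ndjson(cells, index_name="query-field-matrix"):
    """Build a _bulk request body: one action line plus one source line per cell."""
    lines = []
    for cell in cells:
        lines.append(json.dumps({"index": {"_index": index_name}}))
        lines.append(json.dumps(cell, sort_keys=True))
    # The bulk body must end with a newline; an empty input gives an empty body.
    return "\n".join(lines) + "\n" if lines else ""


# Illustrative cell only; the field names are hypothetical.
cells = [
    {"version": "version-a", "query": "term", "field": "keyword", "outcome": "match"},
]
print(to_bulk_ndjson(cells), end="")
```

With one flat document per cell, version, query type and field type can each be used as a filter or a bucket in Kibana.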

From poking around the data a high level summary is:

nik9000 commented 3 years ago

I think it'd be cool to have a test for each entry in the matrix. Just paranoia. I wonder if we can do them without hand generating them.

For what it's worth, it's always bothered me that we don't document these together too. But we do try to make them fairly loosely coupled. Sort of. We don't try super hard.

markharwood commented 3 years ago

I think it'd be cool to have a test for each entry in the matrix. Just paranoia. I wonder if we can do them without hand generating them

Full test coverage would be good. The requirement differs a little compared to conventional tests because we want to document:

1) Bugs in older versions where there were no tests
2) Known and accepted shortcomings:
   a) Unsupported combos (e.g. flattened field doesn't support wildcard).
   b) Exceptions to rules, e.g. term-level queries don't have case changed, unless on a keyword field with a normalizer.
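One way to picture the difference is a check that compares each observed result against a documented expectation that may itself be "unsupported" or "known bug", not just "pass". The `Expect` enum and `check_cell` below are hypothetical names used only for this sketch.

```python
from enum import Enum


class Expect(Enum):
    SUPPORTED = "supported"
    UNSUPPORTED = "unsupported"  # known and accepted shortcoming
    KNOWN_BUG = "known_bug"      # broken in an older version, documented as such


def check_cell(expected, observed_ok):
    """Return None if the observation agrees with the documented expectation."""
    if expected is Expect.SUPPORTED and not observed_ok:
        return "regression: documented as supported but failed"
    if expected is not Expect.SUPPORTED and observed_ok:
        return f"stale docs: documented as {expected.value} but now works"
    return None


# The flattened/wildcard combo is documented as unsupported, so a failure is expected.
print(check_cell(Expect.UNSUPPORTED, observed_ok=False))
# A combo documented as a known bug that now works means the matrix needs updating.
print(check_cell(Expect.KNOWN_BUG, observed_ok=True))
```

Under this scheme a documented failure counts as agreement, and the check only complains when the software and the matrix disagree.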

ywelsch commented 3 years ago

I like the idea of turning this into a test suite (with particular attention on having as much coverage as possible).

The requirement differs a little compared to conventional tests because we want to document:

I think there's a good reason to make them conventional tests.

markharwood commented 3 years ago

@ywelsch I think there may be several concerns with back-porting conventional tests: