Open proddata opened 3 years ago
If I understand this correctly it sounds a bit like a use-case for jsonpath support: https://www.postgresql.org/docs/12/functions-json.html#FUNCTIONS-SQLJSON-PATH
I think this is (or could be) more than "just" an operator; it may also call for a different indexing strategy.
Afaik, right now the index on e.g. skills['type'] points to a document id. In order to quickly select on a specific element in an array, the index would need to point to an element within the array of a document.
e.g.
id | skills |
---|---|
1 | [ { "type": "SQL" , "level": 4}] |
2 | [ { "type": "SQL" , "level": 2}, { "type": "Java" , "level": 4}] |
Right now the indexes on ['type'] and ['level'] point to the document, but they could also point to the individual array element.
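To make the difference concrete, here is a minimal sketch in plain Python of the two indexing ideas, using the two rows from the table above. It is only a model of the concept, not how the actual index is implemented; `build_indexes`, `doc_level` and `element_level` are made-up names for illustration.

```python
from collections import defaultdict

# Sample rows from the table above: document id -> skills array.
docs = {
    1: [{"type": "SQL", "level": 4}],
    2: [{"type": "SQL", "level": 2}, {"type": "Java", "level": 4}],
}


def build_indexes(docs):
    """Build a document-level and an element-level index (hypothetical model)."""
    doc_level = defaultdict(set)      # (field, value) -> {doc_id}
    element_level = defaultdict(set)  # (field, value) -> {(doc_id, position)}
    for doc_id, skills in docs.items():
        for pos, skill in enumerate(skills):
            for field, value in skill.items():
                doc_level[(field, value)].add(doc_id)
                element_level[(field, value)].add((doc_id, pos))
    return doc_level, element_level


doc_level, element_level = build_indexes(docs)

# Document-level postings can only be intersected per document, so
# "type = SQL" and "level = 4" may be satisfied by different elements.
doc_hits = doc_level[("type", "SQL")] & doc_level[("level", 4)]

# Element-level postings are intersected per array element, so both
# conditions have to hold on the same element.
element_hits = {
    doc_id
    for doc_id, _pos in element_level[("type", "SQL")] & element_level[("level", 4)]
}
```

In this model `doc_hits` contains both documents, although document 2 only has "Java" at level 4, while `element_hits` contains only document 1.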
Use case:
As a user I want to be able to have information stored in arrays, but still be able to make a selection on logical combinations of individual object properties (e.g. I want to select all engineers that know "SQL" with a level of at least "3"):
Example 1 - Array comparison
To find all engineers that know "SQL" and have a skill with a level of at least "3", today one can filter on each column individually using array comparisons:
Example 2 - Object comparison
One can also use an object comparison to find engineers that have a `SQL` skill of e.g. exactly level 2. This, however, basically uses a table scan and should only be used together with another index.
Example 3 - UNNEST + sub-SELECTs
To find all engineers that know "SQL" with a level of at least "3", one would need to use `unnest` and sub-selects.
To make it faster, one should probably also add a filter on the original array columns, which makes the query itself rather complex.
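The original queries are not reproduced here, but the semantics of the three examples can be modeled in plain Python against the sample rows. This is a rough sketch of what each approach is able to express, not of how the database executes it; all function names are hypothetical.

```python
# Sample rows from the table above: document id -> skills array.
engineers = {
    1: [{"type": "SQL", "level": 4}],
    2: [{"type": "SQL", "level": 2}, {"type": "Java", "level": 4}],
}


def any_per_column(skills, skill_type, min_level):
    """Example 1: each condition may be satisfied by a different element."""
    return (any(s["type"] == skill_type for s in skills)
            and any(s["level"] >= min_level for s in skills))


def object_equals(skills, wanted):
    """Example 2: whole-object equality, so only exact matches are possible."""
    return any(s == wanted for s in skills)


def unnest_match(skills, skill_type, min_level):
    """Example 3: both conditions must hold on the same (unnested) element."""
    return any(s["type"] == skill_type and s["level"] >= min_level for s in skills)


def select(predicate):
    """Return the sorted ids of all engineers whose skills satisfy the predicate."""
    return sorted(doc_id for doc_id, skills in engineers.items() if predicate(skills))


example_1 = select(lambda s: any_per_column(s, "SQL", 3))
example_2 = select(lambda s: object_equals(s, {"type": "SQL", "level": 2}))
example_3 = select(lambda s: unnest_match(s, "SQL", 3))
# Example 3 with the extra pre-filter on the original array columns.
example_3_prefiltered = select(
    lambda s: any_per_column(s, "SQL", 3) and unnest_match(s, "SQL", 3)
)
```

In this model `example_1` also returns engineer 2 (SQL at level 2, Java at level 4), which is the false positive the per-column filter cannot exclude. `example_3` returns only engineer 1, and the pre-filter does not change that result; it only narrows the candidates before the element-wise check.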
As of now, one could only reduce the search space by using `ANY` on the individual columns, but can't actually represent a logical combination in the selection. On top of that, one needs to use an `UNNEST`, which makes querying rather expensive.
Feature description: see use case / t.b.d.
Maybe something like https://www.monterail.com/blog/how-to-index-objects-elasticsearch