matrixorigin / matrixone

Hyperconverged cloud-edge native database
https://docs.matrixorigin.cn/en
Apache License 2.0

[Tech Request]: Improve testing of Query Plans for Master/Vector/Secondary Index #15019

Open arjunsk opened 7 months ago

arjunsk commented 7 months ago

Is there an existing issue for the same feature request?

Is your feature request related to a problem?

The performance of index queries is not covered by the regression tests. We need a strategy to test:
- The plan generated for index queries
- The data stored inside the index hidden table
- Performance metrics of index queries across different runs

Describe the feature you'd like

Possible approaches to solve this

Describe implementation you've considered

No response

Documentation, Adoption, Use Case, Migration Strategy

No response

Additional information

No response

arjunsk commented 7 months ago

As suggested by @fengttt, one approach is to support EXPLAIN FORMAT=JSON, for example:

EXPLAIN FORMAT=JSON select * from t3;

The resulting JSON string can then be queried (with a JSON path, for example) to check whether a node such as BlockFilter is present.
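To illustrate the kind of check this would enable, here is a minimal Python sketch that searches a JSON plan for a given key. The plan layout used below (the "steps", "children" and "name" fields, and BlockFilter appearing as a key on a scan node) is made up for illustration and is not MatrixOne's actual EXPLAIN output. The helper names iter_nodes and plan_has_key are likewise hypothetical.

```python
import json
from typing import Any, Iterator


def iter_nodes(plan: Any) -> Iterator[dict]:
    """Yield every JSON object nested anywhere inside the plan document."""
    if isinstance(plan, dict):
        yield plan
        for value in plan.values():
            yield from iter_nodes(value)
    elif isinstance(plan, list):
        for item in plan:
            yield from iter_nodes(item)


def plan_has_key(plan_json: str, key: str) -> bool:
    """Return True if any object in the JSON plan contains the given key."""
    plan = json.loads(plan_json)
    return any(key in node for node in iter_nodes(plan))


# Hypothetical plan shape, invented for this example only.
sample_plan = json.dumps({
    "steps": [
        {"name": "Project", "children": [
            {"name": "Table Scan", "table": "t3", "BlockFilter": "a > 10"}
        ]}
    ]
})

print(plan_has_key(sample_plan, "BlockFilter"))  # True
print(plan_has_key(sample_plan, "IndexScan"))    # False
```

Walking the whole document rather than following a fixed path keeps the check independent of where the node sits in the plan tree. That may matter if the plan shape changes between releases.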

Reference

  1. JSON Extract

SELECT *
FROM your_table
WHERE JSON_EXTRACT(your_json_column, '$.your_json_path') REGEXP 'your_regex_pattern';

arjunsk commented 7 months ago

I don't think we support functions such as REGEX, LENGTH, etc. on the query plan output from "EXPLAIN".

arjunsk commented 6 months ago

A. For Vector Index

  1. Search Query Correctness BVT is improved by using precise coordinates: https://github.com/matrixorigin/matrixone/blob/main/test/distributed/cases/array/array_index_knn.sql
  2. Vector Insert Performance will be monitored with this test: https://github.com/matrixorigin/matrixone/issues/15018
  3. Vector QPS performance needs to be evaluated with this test: https://github.com/matrixorigin/matrixone/issues/15781

B. For Master Index: No plan yet

C. For Secondary Index: No plan yet