pingcap / tiflash

The analytical engine for TiDB and TiDB Cloud. Try free: https://tidbcloud.com/free-trial
https://docs.pingcap.com/tidb/stable/tiflash-overview
Apache License 2.0

Enhance minmax index to support more data types. #6016

Open hongyunyan opened 2 years ago

hongyunyan commented 2 years ago

Enhancement

Our minmax index currently supports only the handle column and columns whose data type satisfies isInteger() or isDateorDateTime().

Many other data types, such as float, double, decimal, and string, are not yet supported by the minmax index.

To improve our indexing capability, we can first extend the minmax index to more data types so that it can be used in more queries.
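
To make the idea concrete, here is a minimal sketch of how a per-pack min-max index can skip data for a range predicate, and why the same logic extends to any type with a total order. It is not TiFlash code: `PackMinMax`, `build_minmax`, and `packs_for_range` are hypothetical names, and strings are compared by code point here, whereas a real implementation would have to respect the column's collation.

```python
from dataclasses import dataclass
from typing import Any, List, Sequence


@dataclass
class PackMinMax:
    """Min and max of one column within one pack (hypothetical layout)."""
    min_value: Any
    max_value: Any


def build_minmax(packs: Sequence[Sequence[Any]]) -> List[PackMinMax]:
    # One (min, max) entry per non-empty pack.
    return [PackMinMax(min(p), max(p)) for p in packs]


def packs_for_range(index: List[PackMinMax], low: Any, high: Any) -> List[int]:
    # Keep a pack only if its [min, max] overlaps the query range [low, high].
    return [
        i for i, e in enumerate(index)
        if e.max_value >= low and e.min_value <= high
    ]


# Integer column: the kind of type already covered today.
int_packs = [[1, 5, 9], [10, 12, 18], [20, 25, 31]]
int_index = build_minmax(int_packs)
int_hits = packs_for_range(int_index, 11, 21)

# String column: the same overlap test works once the type is ordered.
str_packs = [["apple", "banana"], ["cherry", "grape"], ["melon", "pear"]]
str_index = build_minmax(str_packs)
str_hits = packs_for_range(str_index, "c", "d")
```

With these inputs, the integer query keeps packs 1 and 2, and the string query keeps only pack 1; the skipped packs never need to be read.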

hongyunyan commented 2 years ago

/assign

JaySon-Huang commented 2 years ago

For float, double - Comparing floating-point values is not an exact operation. Can we ensure the index won't filter out rows that should match? How do other database systems handle this? I think that when comparing a floating-point column to a value, the executor usually casts the floating-point value to a decimal value for the comparison, which could make the min-max index on floating-point columns invalid.
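
The concern can be illustrated with a small sketch. It assumes, purely for illustration, that the index check is done in 32-bit float while the executor casts the stored value exactly to decimal before comparing; whether any real executor behaves this way is the open question above. `to_float32` is a hypothetical helper.

```python
import struct
from decimal import Decimal


def to_float32(x: float) -> float:
    # Round a Python float (64-bit) to the nearest 32-bit float value.
    return struct.unpack("f", struct.pack("f", x))[0]


# Value held in a hypothetical FLOAT column, in a pack with only that row.
stored = to_float32(0.1)
pack_min = pack_max = stored

# Index check for `col > 0.1` done in float32: the literal rounds to the
# same 32-bit value as the stored one, so the pack looks irrelevant.
index_keeps_pack = pack_max > to_float32(0.1)

# Row check done after an exact cast of the stored value to decimal:
# the float32 nearest to 0.1 is slightly larger than decimal 0.1.
row_matches = Decimal(stored) > Decimal("0.1")
```

Here `index_keeps_pack` is `False` while `row_matches` is `True`, so under these assumptions the index would skip a pack that contains a matching row. The index and the executor have to agree on the comparison semantics for the filter to be safe.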

For string

Besides min-max, there are other kinds of rough set indexes (some systems call them zone maps) that could help us filter out irrelevant data more effectively when executing queries.
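
As one possible example of such an index, here is a sketch of a per-pack distinct-value set, which can skip packs for equality predicates where min-max cannot. The choice of this index type is an assumption for illustration, not something proposed in this thread, and `build_set_index`, `packs_for_equal`, and `max_distinct` are hypothetical names.

```python
from typing import List, Optional, Sequence, Set


def build_set_index(
    packs: Sequence[Sequence[str]], max_distinct: int = 4
) -> List[Optional[Set[str]]]:
    index: List[Optional[Set[str]]] = []
    for pack in packs:
        distinct = set(pack)
        # Too many distinct values: store nothing, so the pack is never skipped.
        index.append(distinct if len(distinct) <= max_distinct else None)
    return index


def packs_for_equal(index: List[Optional[Set[str]]], value: str) -> List[int]:
    # Keep a pack if it has no set (unknown) or the set contains the value.
    return [i for i, s in enumerate(index) if s is None or value in s]


packs = [["cn", "us", "cn"], ["de", "fr", "jp", "kr", "uk"], ["br", "us"]]
set_index = build_set_index(packs)
hits = packs_for_equal(set_index, "de")
```

For the predicate `col = 'de'`, only pack 1 is kept. A min-max index would also have kept packs 0 and 2, because "de" falls inside both the "cn" to "us" and the "br" to "us" ranges.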