Added an optimized matching framework for standard SQL's LIKE

This work mainly involves modifying the implementation of like in tsfile to distinguish it from regexp and optimize it.

This implementation adds two core matching classes, LikePattern and LikeMatcher, as well as three matching methods, FjsMatcher, NFA, and DFA.

Matcher Overview

FjsMatcher: This uses KMP string search, which also uses BM's bad character rules to cut off some cases and speed up the string search process
NFA: Construct the corresponding NFA state machine according to the pattern, use the NFA state machine to match the input string, and there is backtracking
DFA: After constructing the NFA according to the pattern, simplify it into DFA, and use the DFA state machine to match the input string, and there is no backtracking

The main process is as follows:

Pattern parsing:
- Escape character processing
- Statistical information calculation, such as min and max, suffix and prefix
- Fuzzy suffix judgment
- Select matchers based on data features: only contains % and characters, use string search FjsMatcher; when it contains _, use the state machine, NFA is the default state, and DFA is the optimized state
Matching process
- Filter out strings that are not within the length range according to the statistical information min, max
- Filter out strings that do not meet the prefix and suffix information according to the statistical information suffix, prefix
- Matching using matchers

apache / tsfile