coldrye-collaboration / gitea

Working on some ideas to make a very good thing even better.
https://gitea.com
MIT License

Support fuzzy keyword title search #6

Open silkentrance opened 5 days ago

silkentrance commented 5 days ago

Feature Description

Bleve

Currently, the title is searched using a bleve MatchPhraseQuery (MPQ), which fails on partial-word searches such as iss for issue. The MPQ does find camel-cased words, but you have to enter a whole component of the word, e.g. foo for fooBar.
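To make the difference between whole-term and fuzzy matching concrete, here is a rough stdlib-only model. It is not how bleve is implemented: `levenshtein` and `matchesTitle` are hypothetical helpers, and splitting the title on whitespace is an assumption that ignores bleve's analyzers (including the camel-case handling).

```go
package main

import (
	"fmt"
	"strings"
)

// min3 returns the smallest of three ints.
func min3(a, b, c int) int {
	m := a
	if b < m {
		m = b
	}
	if c < m {
		m = c
	}
	return m
}

// levenshtein returns the edit distance between a and b, counted in runes.
func levenshtein(a, b string) int {
	ra, rb := []rune(a), []rune(b)
	prev := make([]int, len(rb)+1)
	for j := range prev {
		prev[j] = j
	}
	for i := 1; i <= len(ra); i++ {
		cur := make([]int, len(rb)+1)
		cur[0] = i
		for j := 1; j <= len(rb); j++ {
			cost := 1
			if ra[i-1] == rb[j-1] {
				cost = 0
			}
			cur[j] = min3(prev[j]+1, cur[j-1]+1, prev[j-1]+cost)
		}
		prev = cur
	}
	return prev[len(rb)]
}

// matchesTitle reports whether any whitespace-separated term of title is
// within maxEdits edits of keyword. maxEdits = 0 models whole-term matching.
func matchesTitle(title, keyword string, maxEdits int) bool {
	kw := strings.ToLower(keyword)
	for _, term := range strings.Fields(strings.ToLower(title)) {
		if levenshtein(term, kw) <= maxEdits {
			return true
		}
	}
	return false
}

func main() {
	title := "Support fuzzy keyword title search"
	fmt.Println(matchesTitle(title, "fuzy", 0)) // whole-term match only
	fmt.Println(matchesTitle(title, "fuzy", 1)) // one edit allowed
	fmt.Println(levenshtein("iss", "issue"))    // edits needed for the prefix case
}
```

In this model, iss is two edits away from issue, so whether a fuzziness of 1 covers that particular case depends on how the actual query is built.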

Also, the fuzziness is determined incorrectly by modules/indexer/internal/bleve/util.go/GuessFuzzinessByKeyword. bleve only supports a fuzziness of 0, 1, or 2, although some sources state 3. Testing showed, however, that a higher fuzziness returns too many results and that a value of 1 seems to be the sweet spot. The function also returns values < 1 for keywords with a length < 4, which should be addressed, too. Chinese and other languages also need to be checked, because the function bails out on Unicode characters with a code point > 128 and returns 0, basically limiting fuzzy searches to more or less standard ASCII.
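One possible direction, sketched with the standard library only: return a fixed fuzziness of 1 for every non-empty keyword, regardless of length or script. `guessFuzziness` and `fixedFuzziness` are hypothetical names and not the existing Gitea function, and giving short and non-ASCII keywords a fuzziness of 1 is only one reading of the points above. Whether that is appropriate for Chinese and other languages still needs to be checked.

```go
package main

import (
	"fmt"
	"strings"
	"unicode/utf8"
)

// fixedFuzziness is the value reported above as the sweet spot.
// Treating it as a constant is an assumption of this sketch.
const fixedFuzziness = 1

// guessFuzziness is a hypothetical helper. It returns 0 for empty or
// whitespace-only keywords and fixedFuzziness otherwise. It counts runes,
// so non-ASCII keywords are not treated differently from ASCII ones.
func guessFuzziness(keyword string) int {
	if utf8.RuneCountInString(strings.TrimSpace(keyword)) == 0 {
		return 0
	}
	return fixedFuzziness
}

func main() {
	for _, kw := range []string{"", "iss", "issue", "问题"} {
		fmt.Printf("%q -> %d\n", kw, guessFuzziness(kw))
	}
}
```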

DB Indexer

The DB indexer uses LIKE '%...%', so it should be left unchanged.

Elasticsearch / Meilisearch

Elasticsearch and Meilisearch need to be addressed, too.

Rationale

The rationale for limiting fuzzy search to the title only is that a title is much easier to remember than what is inside a wall of text, such as the content of comments or the issue itself. Also, the fuzzy search is slower than the standard MPQ.

Resources

Original issues

Pull Request

silkentrance commented 5 days ago

[3 images]