meilisearch / charabia

Library used by Meilisearch to tokenize queries and documents
MIT License

Fix unused FstSegmenter warning when not using khmer compiler features #261

Closed timvisee closed 10 months ago

timvisee commented 10 months ago

Pull Request

Related issue

Fixes https://github.com/meilisearch/charabia/issues/260

Depends on https://github.com/meilisearch/charabia/pull/259

What does this PR do?
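
The description does not spell out the change, so the following is only a minimal sketch of the usual way to avoid this kind of warning, not the actual diff. It assumes the warning appears because an item is imported unconditionally while its only user sits behind a `khmer` Cargo feature. The module `fst`, the function `segment_khmer`, and the bodies shown here are hypothetical stand-ins; only the `FstSegmenter` name and the `khmer` feature come from the PR title.

```rust
// Hypothetical stand-in for a dictionary-backed segmenter. It is compiled
// only when the (assumed) `khmer` Cargo feature is enabled.
#[cfg(feature = "khmer")]
mod fst {
    pub struct FstSegmenter;

    impl FstSegmenter {
        // Placeholder logic: splits on whitespace instead of using a real FST.
        pub fn segment<'a>(&self, text: &'a str) -> Vec<&'a str> {
            text.split_whitespace().collect()
        }
    }
}

// Gate the import with the same condition as its only user, so the compiler
// never sees an unused `FstSegmenter` when the feature is disabled.
#[cfg(feature = "khmer")]
use fst::FstSegmenter;

// Variant built with `--features khmer`: delegates to the gated segmenter.
#[cfg(feature = "khmer")]
pub fn segment_khmer(text: &str) -> Vec<&str> {
    FstSegmenter.segment(text)
}

// Fallback built without the feature: returns the input as a single segment
// and does not reference `FstSegmenter` at all.
#[cfg(not(feature = "khmer"))]
pub fn segment_khmer(text: &str) -> Vec<&str> {
    vec![text]
}
```

Keeping the `#[cfg(...)]` condition identical on the definition, the import, and every use site tends to be simpler to maintain than silencing the lint with `#[allow(unused)]`, because the compiler keeps reporting items that are truly unused.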


Thank you so much for contributing to Meilisearch!

irevoire commented 10 months ago

bors merge

meili-bors[bot] commented 10 months ago

Build succeeded.

timvisee commented 10 months ago

I didn't know qdrant was using charabia; that's awesome! I love your work 😄

May I ask in which part of your pipeline it found its place? 👀

Of course. We use it to tokenize our full-text payload/metadata indexes. It saves us a lot of hassle building our own tokenizer. Thank you for your work!

In code: https://github.com/qdrant/qdrant/blob/dev/lib/segment/src/index/field_index/full_text_index/tokenizers.rs

Again, thanks for handling this PR quickly.