meilisearch/charabia
Library used by Meilisearch to tokenize queries and documents
MIT License
261 stars
89 forks
issues
`into_tokenizer` doesn't work with stop_words
#319
ManyTheFish
opened
4 hours ago
0
Bump Swatinem/rust-cache from 2.7.3 to 2.7.5
#318
dependabot[bot]
closed
2 weeks ago
1
Latin camelcase wrong segmentation
#317
PedroTurik
closed
2 weeks ago
1
Normalization Issue for Turkish Characters in Charabia
#316
niyazialpay
closed
3 weeks ago
3
Replace jemalloc(ator) with mimalloc, which covers wider platforms
#315
tats-u
opened
1 month ago
3
charabia can't be built in Windows due to jemallocator used only for bench
#314
tats-u
opened
1 month ago
1
Update wana_kana to 4.0.0
#313
tats-u
closed
1 month ago
0
Update wana_kana to 4.0.0
#312
tats-u
closed
1 month ago
3
fix: Segment number into word instead of chars (#271)
#311
dqkqd
closed
1 month ago
3
Bump peter-evans/create-pull-request from 6 to 7
#310
dependabot[bot]
closed
1 month ago
1
Mutualize char normalizers
#309
ManyTheFish
opened
2 months ago
0
Prepare v0.9.1
#308
ManyTheFish
closed
2 months ago
1
Update version for the next release (v0.9.1) in Cargo.toml files
#307
meili-bot
closed
2 months ago
0
German: Adds some more test cases and updates dictionary
#306
luflow
closed
2 months ago
3
Add Turkish normalizer
#305
tkhshtsh0917
closed
3 months ago
3
Persian language support for normalization and segmentation
#304
Ja7ad
opened
3 months ago
6
feat: Adds German compound words decomposition with new segmenter
#303
luflow
closed
2 months ago
12
Update version for the next release (v0.9.0) in Cargo.toml files
#302
meili-bot
closed
4 months ago
0
Add math symbols to default separators
#301
phillitrOSU
closed
4 months ago
1
Add Math symbols in the default separator list
#300
ManyTheFish
closed
4 months ago
0
Simplify lang detection
#299
ManyTheFish
closed
4 months ago
1
update internal dependencies for release
#298
irevoire
closed
4 months ago
1
Update dependencies
#297
irevoire
closed
4 months ago
1
Normalizer for russian
#296
aignatovich
opened
5 months ago
9
Add null byte as hard context separator
#295
LukasKalbertodt
closed
4 months ago
2
Normalization Issue for Turkish Characters in Charabia
#294
niyazialpay
closed
3 months ago
7
Update version for the next release (v0.8.11) in Cargo.toml files
#293
meili-bot
closed
5 months ago
2
Upgrade Lindera to 0.31.0
#292
mosuka
closed
6 months ago
3
fix: fixed `chinese-normalization-pinyin` feature test failed
#291
tkhshtsh0917
closed
5 months ago
3
The `chinese-normalization-pinyin` feature flag doesn't compile
#290
ManyTheFish
closed
5 months ago
6
latin-camelcase feature make wrong segmentation
#289
hamano
closed
2 weeks ago
11
Update version for the next release (v0.8.10) in Cargo.toml files
#288
meili-bot
closed
6 months ago
0
Add swedish recomposition normalizer and link it to a feature
#287
ManyTheFish
closed
6 months ago
1
Update bors.toml with missing tests
#286
curquiza
closed
7 months ago
1
Rework Chinese Pinyin normalizer
#285
ManyTheFish
opened
7 months ago
0
Update README.md
#284
ManyTheFish
closed
7 months ago
1
Update version for the next release (v0.8.9) in Cargo.toml files
#283
meili-bot
closed
7 months ago
1
Make the pinyin-normalization optional
#282
ManyTheFish
closed
7 months ago
1
Fix char boundary panic
#281
ManyTheFish
closed
7 months ago
2
Add `\t` as recognized separator
#280
Gusted
closed
7 months ago
1
Update Lindera to 0.30.0
#279
mosuka
closed
7 months ago
1
Adds a new normalizer to normalize œ to oe and æ to ae
#278
Soham1803
closed
6 months ago
10
Update version for the next release (v0.8.8) in Cargo.toml files
#277
meili-bot
closed
8 months ago
4
Tag and release new version?
#276
6543
closed
8 months ago
1
Support markdown formatted codeblocks
#275
6543
closed
8 months ago
3
Bump release-drafter/release-drafter from 5 to 6
#274
dependabot[bot]
closed
8 months ago
1
Update Lindera to 0.28.0
#273
mosuka
closed
9 months ago
1
[Maintenance] Review and amend documentation in files
#272
ManyTheFish
opened
9 months ago
0
Numbers are not segmented the same way depending on the Script/Language
#271
ManyTheFish
closed
1 month ago
13
Vietnamese: Add lacking tests and fix bug
#270
ManyTheFish
closed
9 months ago
2