character-ngrams Search Results

425 results for character-ngrams

nltk/nltk #2992

3.7: pytest is failing

I'm trying to package your module as an rpm package. So I'm using the typical PEP517 based build, install and test cycle used on building packages from non-root account. - `python3 -sBm build -w --no…

kloczek updated 1 year ago
11
wooorm/franc #100

Improved accuracy for small documents

I'd like to play with patching franc, or making some alternative to it, that can detect the language of small documents much more accurately. First of all is this something that could be interestin…

fabiospampinato updated 1 year ago
19
ThomasFaria/retex-innovation-insee #3

Remarks

- Intro: Insee -> people don't necessarily know what it is? - 2.1: "interestingly, subsequent projects involving large datasets didn’t suffer much from this change, as their needs were actually…

tomseimandi updated 7 months ago
3
meilisearch/meilisearch #4636

matching strategy "all" is not matching expected hits

**Describe the bug** When using matching strategy 'all', I expect that documents where all search terms of the query match at least one searchable attribute are considered as hits. However, it seems…

mdostmann updated 5 months ago
4
MaartenGr/BERTopic #90

About Coherence of topic models

Currently, I am calculating the Coherence of a bertopic model using the gensim. For this I need the n_grams from each text of the corpus. Is it possible? The function used by gensim waits for the corp…

nadiafelix updated 5 months ago
79
rapidsai/cudf #12806

[BUG] Series.str.character_ngrams(as_list=True) resets index…

**Describe the bug** `Series.str.character_ngrams(as_list=True)` resets index when it shouldn't **Steps/Code to reproduce bug** Consider the following code: ``` import cudf df = cudf.DataFrame…

daxiongshu updated 9 months ago
1
paradedb/paradedb #295

Enable/fix custom tokenizers (CJK + ngrams)

Updated Issue: https://github.com/paradedb/paradedb/pull/276 implements a CJK tokenizer, but it doesn't seem to be working. To replicate: ```sql CREATE TABLE tokenizer_config AS SELECT * FROM p…

ghost updated 7 months ago
1
piskvorky/gensim #3068

After model normalization wmdistance() isn't working (self t…

#### Problem description I want to calculate the Word Mover's Distance. After the normalization (`model.init_sims(replace=True)`) of my self made fastText model, the `wmdistance()` function isn't wor…

mattkoehne updated 3 years ago
10
streetcomplete/StreetComplete #5671

Text used for quest_generic_otherAnswers2 can be mistaken fo…

Quests that use the `quest_generic_otherAnswers2` display answers such as "UH…", "ER…" "OH…". To new users, it is not always obvious that these are actually words and furthermore can sometimes be mist…

bompstable updated 5 months ago
17
elastic/elasticsearch #61435

Allow exact matching on wildcard types w/o asterisk

The new wildcard data type forces the user to construct queries containing the wildcard '*' character, even when no expansion is desired. This feels counter-intuitive, especially since the wildcard ty…

aleph-zero updated 4 months ago
6
