character-ngrams Search Results

425 results for character-ngrams

quanteda/quanteda #392

char_ngrams/token_ngrams.character call different C++ ngrams…

`char_ngrams/token_ngrams.character`: calls `skipgramcpp()` `tokens_ngrams.tokens`: calls `qatd_cpp_ngram_mt_list()` Is this intentional? Behaviours are different: see #391.

kbenoit updated 7 years ago · 1 comment

quanteda/quanteda #608

Is tokens_ngram leaking memory?

I am developing an application for a Coursera Capstone with the help of the quanteda package and I constantly face two issues mostly when the method tokens_ngram is running: 1) std::bad_alloc on the …

GuiDoSignal updated 7 years ago · 16 comments

CLD2Owners/cld2 #53

Language Detection with CLD2 with Mixed Inputs in long docum…

**Internals Recap.** _CLD2 is a Naïve Bayesian classifier, trained on documents of mean size of 200 characters, trained on a corpus of 100M scraped and human expert selected web pages._ When workin…

loretoparisi updated 7 years ago · 1 comment

quanteda/quanteda #391

tokens_ngrams(x, n = ...) fails when ntokens(x) < n

```r packageVersion("quanteda") ## [1] ‘0.9.8.9029’ tokens_ngrams(tokens("a"), n = 2) ## Warning: stack imbalance in '.Call', 38 then 39 ## Warning: stack imbalance in '{', 35 then 36 ## Warni…

kbenoit updated 7 years ago · 11 comments

ropensci/tokenizers #24

Incorrect skipgrams

I expect skipgrams with k=2 to produce ``` "a b c" "a b d" "a c d" "a c e" "b c d" "b c e" "b d e" "c d e" ``` But I am getting ``` > tokenizers::tokenize_skip_ngrams('a b c d e', n=3, k=2) [[1]] …

koheiw updated 7 years ago · 47 comments

quanteda/quanteda #401

Is topfeatures counting correctly?

A few times I've run into counts from topfeatures that seem way off base. In the example below, I get ngrams 2:5 and topfeatures counts 16 occurrences of the ngram, "humor_no_head_games". I then use k…

BobMuenchen updated 7 years ago · 2 comments

vseloved/cl-nlp #5

Lispworks issues with chars.lisp

I've found a couple of compatibility issues with the chars.lisp file in src/utils/ and LispWorks the first was in the +WHITE-CHARS+ param, LispWorks uses #\NO-BREAK-SPACE so I did: ``` lisp (defpara…

ELind77 updated 7 years ago · 8 comments

BMDSoftware/neji #9

Trained Model Recognizes Nothing

Hi, I just tried to train a model by "./nejiTrain.sh -a example/train/annotations -c example/train/sentences -f example/train/bw_o2_windows.config -if BC2 -m mymodel -o mymodel -t 11". However, …

wangmj17 updated 7 years ago · 9 comments

quanteda/quanteda #149

ngrams performance

@adamobeng has proposed a simple R-based ngram former that seems to blow away the C++ code in terms of speed. @koheiw has something gone awry with the C++ code? We should test this and figure out what…

kbenoit updated 7 years ago · 12 comments

nltk/nltk #1526

Circular imports between probability, text and utils modules

I am running into circular import problems while preparing a PR to add some functionality to `nltk/text.py` - specifically, I am trying to write a new class `TextFreqDist` in `nltk/probability.py`, wh…

sr-murthy updated 8 years ago · 2 comments
