-
Part of this has to do with #690, since if we always call `tokens()` on `x` inside `dfm.tokens(x)`, it will perform these operations.
First, we don't currently remove elements before forming bigram…
-
I am opening this issue for general discussion and updates/notes. Please feel free to close it at any time or move it to another channel, whichever works best. Right now, we have been using 'English k…
-
It would be helpful to use word indexes instead of start and end indexes. This is because parser expectations are too strict when using character indexes. Using a word index would give far more flexibility and …
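For illustration, a hypothetical helper (the name and signature are invented, not the parser's actual API) showing how character start/end offsets can be mapped onto word indexes:

```python
import re

def char_span_to_word_span(text, start, end):
    """Map a character span [start, end) onto (first, last) word indexes.

    Illustrative sketch only; assumes the span overlaps at least one word.
    """
    # Character offsets of each whitespace-delimited word.
    words = [(m.start(), m.end()) for m in re.finditer(r"\S+", text)]
    first = next(i for i, (s, e) in enumerate(words) if e > start)
    last = max(i for i, (s, e) in enumerate(words) if s < end)
    return first, last
```

For `"quick brown fox"`, the character span 6–11 maps to word index 1 (`"brown"`), and any slight misalignment of the character offsets still resolves to the same word — which is the flexibility being asked for.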
-
This refers to an e-mail conversation.
When building a DFM with n-grams (rather than unigrams), the option to apply a thesaurus or dictionary fails because there is no match between an n-gram and dictionary key…
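The mismatch can be sketched in a few lines of Python (illustrative only; `"_"` is assumed here as the n-gram concatenator, and the dictionary values are invented): exact lookup of a joined n-gram against unigram keys finds nothing, while matching the n-gram's components does.

```python
# Unigram dictionary keys (hypothetical example).
dictionary = {"tax": "ECONOMY", "economy": "ECONOMY"}

# N-gram features joined with an assumed "_" concatenator.
ngrams = ["tax_cut", "economy_grows"]

# Exact match against unigram keys: nothing is found.
exact = [dictionary.get(g) for g in ngrams]  # [None, None]

# Matching each n-gram's components instead recovers the keys.
by_part = [
    {dictionary[p] for p in g.split("_") if p in dictionary}
    for g in ngrams
]
```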
-
The string " foo bar" yields the ngrams [ " foo", "bar", " foo bar" ], whereas the expected ngrams would be [ "foo", "bar", "foo bar" ]. Also, multiple whitespace characters at the beginning of the st…
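For comparison, a minimal tokenizer sketch (Python, illustrative only — not the package's implementation) that collapses leading, trailing, and repeated whitespace before generating n-grams, and so produces the expected output:

```python
def tokenize(text):
    # str.split() with no arguments drops leading/trailing and repeated
    # whitespace, so no empty or space-prefixed tokens survive.
    return text.split()

def ngrams(tokens, n):
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

toks = tokenize(" foo  bar")
all_ngrams = ngrams(toks, 1) + ngrams(toks, 2)
# -> ["foo", "bar", "foo bar"]
```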
-
I trained the model, but I ran into a problem at line 65 because some values underflow in the log. Note that after the problem is encountered, I only obtain nonsensical text (more like a collection …
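Without seeing the code at line 65, a likely cause is summing probabilities that are each too small to represent, so the sum rounds to zero and the log returns `-inf`. The standard remedy is to stay in log space and use the log-sum-exp trick; a minimal sketch:

```python
import math

def logsumexp(xs):
    """Numerically stable log(sum(exp(x) for x in xs)).

    Avoids underflow when all xs are large negative log-probabilities:
    the max is factored out, so at least one exp argument is 0.
    """
    m = max(xs)
    if m == float("-inf"):
        return m
    return m + math.log(sum(math.exp(x - m) for x in xs))

# Naively, exp(-1000) == 0.0 and log(0.0 + 0.0) is -inf; the stable
# version returns approximately -1000 + log(2).
```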
-
It appears that `.xpath('normalize-space()')` does not handle whitespace ideally in all cases.
Examples:
- `ATelephone ` => `ATelephone`
- `Phone1-855-445-9710` => `Phone1-855-445-9710…
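One plausible explanation, assuming the problematic strings contain a Unicode space such as U+00A0 (no-break space): XPath 1.0's `normalize-space()` collapses only the four XML whitespace characters (space, tab, CR, LF), so other Unicode spaces pass through untouched. A sketch of a normalization that covers all Unicode whitespace (Python's `\s` is Unicode-aware by default):

```python
import re

def normalize_space(s):
    """Collapse runs of ANY Unicode whitespace (including U+00A0) to a
    single space and trim the ends -- broader than XPath's
    normalize-space(), which handles only space, tab, CR, and LF."""
    return re.sub(r"\s+", " ", s).strip()

# With a no-break space between the words:
normalize_space("A\u00a0Telephone ")  # -> "A Telephone"
```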
-
Hello again,
I am wondering if `tokenizers` can use user-provided lexicons to `tokenize` a document.
Something similar to http://tidytextmining.com/sentiment.html where one can use either the `…
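One common way to do this, if no built-in option exists, is greedy longest-match tokenization against the lexicon, so multiword entries are kept as single tokens. A rough Python sketch (the function name is invented; `tokenizers` itself may expose a different interface):

```python
def tokenize_with_lexicon(text, lexicon):
    """Greedy longest-match tokenization: multiword lexicon entries
    become single tokens; everything else falls back to word splitting."""
    words = text.lower().split()
    entries = {tuple(e.split()) for e in lexicon}
    longest = max((len(e) for e in entries), default=1)
    out, i = [], 0
    while i < len(words):
        # Try the longest candidate window first, shrinking to length 1.
        for n in range(min(longest, len(words) - i), 0, -1):
            cand = tuple(words[i:i + n])
            if n > 1 and cand in entries:
                out.append(" ".join(cand))
                i += n
                break
        else:
            out.append(words[i])
            i += 1
    return out
```

For example, with the lexicon entry `"not good"`, the text `"not good at all"` would yield `["not good", "at", "all"]`, keeping the negated phrase intact for sentiment scoring.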
-
It should be possible to have the tokenisers and ngram generators:
1. Return NA when an NA is passed in;
2. Ignore NAs in input stopword lists;
3. Return NAs for empty output vectors.
Which of…
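In Python terms (with `None` standing in for R's `NA`), options 1 and 2 might look like this hypothetical sketch — illustrative semantics only, not the package's actual code:

```python
def tokenize(text, stopwords=()):
    """NA-tolerant tokenization sketch, None standing in for R's NA."""
    if text is None:
        # Option 1: NA input returns NA output.
        return None
    # Option 2: ignore NAs in the input stopword list.
    stops = {w for w in stopwords if w is not None}
    return [w for w in text.split() if w not in stops]
```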
-
When trying to convert a tokens object to a character object via `as.character.tokens`, I get the following error: `Error: could not find function "as.character.tokens"`
My code:
```
myCorpus sessionI…
```