character-ngrams Search Results

425 results
for character-ngrams


quanteda/quanteda #1590

Deprecate $ for tokens

In preparation for v2.0, we will deprecate the `[[]]` and `$` operators for tokens.

koheiw updated 5 years ago
10
piskvorky/gensim #814

Loading fastText binary output to gensim like word2vec

Facebook's recently open-sourced `fasttext` (https://github.com/facebookresearch/fastText) improves the `word2vec` SkipGram model. It follows a similar output format for `word` - `vector` key value pairs,…

phunterlau updated 5 years ago
49
snorkel-team/snorkel #1008

Character-level parsing?

Hi, I am trying to use Snorkel for my problem setting, where I need character-level mappings. What I want is for every character to be a candidate while keeping Ngrams for its context. For example, If…

jbkoh updated 5 years ago
1
piskvorky/gensim #2415

Inference issue using FB pretrained model if word have no ng…

**Problem:** `FastText` in gensim and the official version still produce different output on the FB pretrained model (issue with OOV words **without ngrams**). **Prepare data:** ```bash curl https://dl.…

menshikh-iv updated 5 years ago
11
piskvorky/gensim #2059

fasttext ft_hash and unicode handling

Fasttext uses the hashing trick to map ngrams to an index in [0, N]. Gensim supports loading models trained with the original fasttext implementation from Facebook Research. It is therefore important th…

leezu updated 5 years ago
10
piskvorky/gensim #310

Handling unseen words in the word2vec/doc2vec model

Hi there, Let's take a case where we are training on a corpus that doesn't contain a given word (say "foo"). If this word shows up in an as-yet-unseen test statement, you generally see a KeyError for …

viksit updated 5 years ago
11
facebookresearch/fastText #719

How fasttext predicts the words whose substrings are not in …

Hi guys, I am confused: I trained my fasttext model on an English corpus and then, out of curiosity, used it to predict Chinese, and I still got a word embedding. I know we can g…

DevinWang23 updated 5 years ago
1
piskvorky/gensim #1261

Improve FastText loading times

Consider reading just the `bin` file as suggested in https://github.com/RaRe-Technologies/gensim/issues/814#issuecomment-289464725 Compare to C++ code in https://github.com/salestock/fastText.py/blo…

tmylk updated 5 years ago
33
RasaHQ/rasa #1217

featurizer_count_vectors with words and N-Grams?

Hey, I need to use N-Grams with featurizer_count_vectors, but I think it would significantly improve the classification if you could also use them together with the word itself. Such that not trained word…

ctrado18 updated 5 years ago
6
youngcc3157/RedM_Text_Transformation #34

Remove Punctuation/Unicode from Parsed Output

Some unicode characters are sneaking into our ngrams. We need to find out why and get rid of them. When this is done, we will need to regenerate the unit test files. ```python3 testing_framework.…

emaicus updated 6 years ago
1
