character-ngrams Search Results

425 results for character-ngrams

RasaHQ/rasa #1181

tensorflow intent classification is very sensitive! What t…

Hey, I use the tensorflow embedding for the German language. 1. I notice a high sensitivity of the intent classification to slight changes in sentences, like adding just one whitespace be…

ctrado18 updated 6 years ago
22
standard/standard #10

standard catching error in standard

This might have something to do with the installed toolset on my machine, but I tried running standard after installing it globally (`npm install -g standard`), and I am getting an error thrown on what a…

keithhamilton updated 6 years ago
3
piskvorky/gensim #1642

fastText models from 2.3.0 can't be loaded in 3.0.0

#### Description I have a compatibility issue with fastText and version 3.0.0. In version 2.3.0, I used the fastText C++ wrapper to train a model based on the code available at that time from ht…

Liebeck updated 7 years ago
6
quanteda/quanteda #1055

Summarizing and subtotaling tokens

What's the best way to create a summarized subtotal of tokens? What I'm trying to do is take a text, tokenize large multi-word strings, then count up the frequencies of those strings. ``` token…

cspenn updated 7 years ago
3
tesseract-ocr/tesseract #871

Add man pages for new programs

No man pages are there for the following programs in https://github.com/tesseract-ocr/tesseract/tree/master/doc * classifier_tester * lstmeval * lstmtraining * set_unicharset_properties * tex…

Shreeshrii updated 6 years ago
6
kpu/kenlm #119

Need suggestion : Input file format for training

What should the input file format be for training? I have one sentence per line in a file, with "\n" at the end of each line, and the training command looks like "**bin/lmplz -o 4 dummy.arpa**" Does new…

manishbansal-fk updated 6 years ago
6
snorkel-team/snorkel #838

How to Named Entity Recognize using Data Programming in Snor…

Hi, my goal is to extract two entities (**Industry** and **Company**) from every raw Chinese text (or sentence), where each entity consists of a few Chinese characters. The modeling strategy is **LSTM + CRF**…

wenfeixiang1991 updated 6 years ago
3
ropensci/tokenizers #25

integration into quanteda as a core tokenizer

I started to continue our comments on #24 but thought it best to start a new issue. As for **quanteda**, we are thinking of an rOpenSci-type overhaul of the API that would be a major change. (The cle…

kbenoit updated 6 years ago
14
pytorch/text #162

Subword-level tokenization error while building vocab

I want to analyze the IMDB dataset at the subword (character) level, so I tried the following: ``` TEXT = data.SubwordField(fix_length=100) LABEL = data.Field(sequential=False) train, test = datasets.IMDB.spl…

keon updated 7 years ago
2
quanteda/quanteda #740

tokens_select not selecting ngrams

I thought we had addressed this already, but maybe this is part of #719. ### for tokens Define two sets of tokens, simple unigrams and space-separated bigrams: ```r (toks

kbenoit updated 7 years ago
12
