Closed by hwsamuel 6 years ago
To use n-grams, you need to add "Bow" to your parameters when calling gen_regions. The region vector (the internal representation of text regions) will then be a bag of n-grams.
There are three types of "region vector" (the internal representation of text regions): concatenation of word one-hot vectors (sequential), bag-of-word, and bag-of-n-gram. Concatenation of n-gram one-hot vectors is not supported. This is described on page 4 of http://riejohnson.com/software/conText-v4-ug.pdf .
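In case it helps to see the difference between the three types, here is a rough plain-Python sketch. It is only an illustration and not how conText implements it: the helper names (one_hot, seq_region, bag_region), the toy vocabularies, and the choice of binary (0/1) entries are all assumptions made for this example.

```python
# Illustrative sketch only: these helpers are hypothetical and not part of conText.

def one_hot(index, size):
    """Return a list of zeros of length `size` with a 1 at `index`."""
    vec = [0] * size
    vec[index] = 1
    return vec

def seq_region(tokens, vocab):
    """Sequential: concatenate one word one-hot vector per position (order kept)."""
    vec = []
    for tok in tokens:
        vec.extend(one_hot(vocab[tok], len(vocab)))
    return vec

def bag_region(tokens, vocab, n=1):
    """Bag of n-grams: mark each vocabulary n-gram of length n found in the region.
    With n=1 this is a bag-of-word vector. Position within the region is discarded."""
    vec = [0] * len(vocab)
    for i in range(len(tokens) - n + 1):
        gram = " ".join(tokens[i:i + n])
        if gram in vocab:
            vec[vocab[gram]] = 1
    return vec

region = ["very", "good", "movie"]
word_vocab = {"good": 0, "movie": 1, "very": 2}
bigram_vocab = {"very good": 0, "good movie": 1, "bad movie": 2}

print(seq_region(region, word_vocab))       # [0, 0, 1, 1, 0, 0, 0, 1, 0]
print(bag_region(region, word_vocab))       # [1, 1, 1]
print(bag_region(region, bigram_vocab, 2))  # [1, 1, 0]
```

The sequential vector grows with the region size (region length times vocabulary size), while the bag vectors stay at vocabulary size.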
I get the following error after generating the vocabulary with n=3 and then generating regions. If I leave n at the default value of 1, the error doesn't appear, but does that mean trigrams aren't supported? Is this an architecture-specific error?
!Input error!: (Detected in AzPrepText::gen_regions) NO SUPPORT for n-gram sequential