riejohnson / ConText

ConText v4: Neural networks for text categorization
http://riejohnson.com/cnn_download.html
GNU General Public License v3.0

No support for n-gram sequential #2

hwsamuel closed this issue 6 years ago

hwsamuel commented 6 years ago

I get the following error after generating the vocabulary with n=3 and then generating regions. If I leave n at its default value of 1, the error doesn't appear. Does that mean trigrams aren't supported, or is this an architecture-specific error?

!Input error!: (Detected in AzPrepText::gen_regions)NO SUPPORT for n-gram sequential

riejohnson commented 6 years ago

To use n-grams, add "Bow" to your parameters when calling gen_regions. The region vector (the internal representation of a text region) will then be a bag of n-grams.
There are three types of region vector: concatenation of word one-hot vectors (sequential), bag-of-words, and bag-of-n-grams. Concatenation of n-gram one-hot vectors is not supported, which is why the error appears with n=3. This is described on page 4 of http://riejohnson.com/software/conText-v4-ug.pdf .
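
To make the difference between the three region vector types concrete, here is a minimal Python sketch. It is not ConText's code: the function names, the toy vocabularies, the use of binary presence (rather than counts) in the bag vectors, and the all-zero handling of out-of-vocabulary tokens are all assumptions made for illustration.

```python
def sequential_region(tokens, vocab):
    """Concatenate one one-hot vector per word position.

    Word order inside the region is preserved; the result has
    len(tokens) * len(vocab) dimensions.
    """
    vec = []
    for tok in tokens:
        one_hot = [0] * len(vocab)
        idx = vocab.get(tok)
        if idx is not None:  # assumption: unknown words map to all zeros
            one_hot[idx] = 1
        vec.extend(one_hot)
    return vec


def bag_region(tokens, vocab, n=1):
    """Bag of n-grams: one len(vocab)-dimensional vector for the whole region.

    With n=1 this is a bag of words. Positions of the n-grams inside the
    region are not recorded. Binary presence is an assumption of this sketch.
    """
    vec = [0] * len(vocab)
    for i in range(len(tokens) - n + 1):
        gram = " ".join(tokens[i:i + n])
        idx = vocab.get(gram)
        if idx is not None:
            vec[idx] = 1
    return vec


# Hypothetical toy vocabularies and a three-word region.
word_vocab = {"i": 0, "love": 1, "it": 2, "hate": 3}
bigram_vocab = {"i love": 0, "love it": 1, "i hate": 2}
region = ["i", "love", "it"]

seq = sequential_region(region, word_vocab)       # 3 positions x 4 words
bow = bag_region(region, word_vocab, n=1)         # bag of words
bon = bag_region(region, bigram_vocab, n=2)       # bag of bigrams

print(len(seq), bow, bon)
```

In this sketch the sequential vector grows with the region size times the vocabulary size, while both bag vectors stay at the vocabulary size. The unsupported combination from the error message would correspond to calling something like `sequential_region` with an n-gram vocabulary.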