amplab / training

Training materials for Strata, AMP Camp, etc

extractNGrams: explanation? #96

Open jowens opened 10 years ago

jowens commented 10 years ago
NGrams.extractNGrams(taggedInputTable, c=1, n=2, k=1000, stopWords = NGrams.stopWords)

Might be nice to say what c, n, and k are. I'm presuming n is the n in n-gram and k is the number of feature vectors.

atalwalkar commented 10 years ago

Good point. "c" is the column on which we want to perform n-gram extraction. As you guessed, "n" is the n in n-gram (the number of tokens in each gram), and "k" is the number of features (n-grams) we want to use.
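To make the three parameters concrete, here is a minimal standalone sketch in Python. It is not the MLI implementation: the helper name `extract_top_ngrams`, the whitespace tokenization, and the choice to keep the k most frequent n-grams are all assumptions made for illustration. The thread only states that k is the number of n-gram features to use, not how they are selected.

```python
from collections import Counter


def extract_top_ngrams(rows, c, n, k, stop_words=frozenset()):
    """Hypothetical helper: return up to k most frequent n-grams in column c.

    rows       -- iterable of tuples (one tuple per table row)
    c          -- index of the text column to extract n-grams from
    n          -- number of tokens per gram (2 means bigrams)
    k          -- number of n-gram features to keep
    stop_words -- tokens dropped before n-grams are formed
    """
    counts = Counter()
    for row in rows:
        # Assumption: lowercase + whitespace split stands in for real tokenization.
        tokens = [t for t in row[c].lower().split() if t not in stop_words]
        # Slide a window of n tokens across the filtered token list.
        for i in range(len(tokens) - n + 1):
            counts[tuple(tokens[i:i + n])] += 1
    # Assumption: "the k features we want to use" means the k most frequent.
    return [gram for gram, _ in counts.most_common(k)]


rows = [
    ("label_a", "the quick brown fox"),
    ("label_b", "the quick brown dog"),
]
top = extract_top_ngrams(rows, c=1, n=2, k=1, stop_words={"the"})
print(top)  # [('quick', 'brown')]
```

With `c=1` the text is read from the second column, `n=2` produces bigrams, and `k=1` keeps only the single most frequent bigram, which appears in both rows.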

On Fri, Aug 30, 2013 at 11:38 AM, John Owens notifications@github.com wrote:

NGrams.extractNGrams(taggedInputTable, c=1, n=2, k=1000, stopWords = NGrams.stopWords)

Might be nice to say what c, n, and k are. I'm presuming n is the n in n-gram and k is the number of feature vectors.
