WeiwenXu21 closed 6 years ago
RDD([ ('label1', [('w1',c1), ('w2',c2), ('w3',c3), ...]), ('label2', [('w1',c1), ('w2',c2), ('w3',c3), ...]), ...])
----> This one is good for Naive Bayes, but it is hard to get from it to [('w1',c1), ('w2',c2), ('w3',c3), ...] efficiently

or

RDD([ ('w1', [('label1',c1), ('label2',c2), ('label3',c3), ...]), ('w2', [('label1',c1), ('label2',c2), ('label3',c3), ...]), ...])

or

RDD([ ('label1', [c1, c2, c3, ...]), ('label2', [c1, c2, c3, ...]), ...])

or

RDD([ ('w1', [c1, c2, c3, ...]), ('w2', [c1, c2, c3, ...]), ...])
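For reference, here is a minimal sketch of how the first layout could be collapsed into overall [(word, count), ...] totals, or pivoted into the second (word-keyed) layout. It uses plain Python lists in place of an RDD so it runs without Spark. The sample data and the function names `total_word_counts` and `pivot_to_words` are made up for illustration. On a real RDD the same idea would presumably be a `flatMap` into (word, count) pairs followed by `reduceByKey` or `groupByKey`, but that is an assumption about the pipeline, not something confirmed in this thread.

```python
from collections import defaultdict

# Hypothetical sample data in the first layout: (label, [(word, count), ...])
by_label = [
    ("label1", [("w1", 3), ("w2", 1)]),
    ("label2", [("w1", 2), ("w3", 5)]),
]


def total_word_counts(records):
    """Collapse per-label counts into overall (word, count) pairs, sorted by word."""
    totals = defaultdict(int)
    for _label, pairs in records:
        for word, count in pairs:
            totals[word] += count
    return sorted(totals.items())


def pivot_to_words(records):
    """Turn (label, [(word, c), ...]) into (word, [(label, c), ...]), sorted by word."""
    by_word = defaultdict(list)
    for label, pairs in records:
        for word, count in pairs:
            by_word[word].append((label, count))
    return sorted(by_word.items())


word_totals = total_word_counts(by_label)
by_word = pivot_to_words(by_label)
```

Both helpers make a single pass over every (label, word, count) triple, so the cost is linear in the number of nonzero counts. The first layout stays convenient for Naive Bayes, and the other views can be derived from it when needed.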
The first one will be best for detecting the words in testing! Please!
sorted!