ltflores / csc-869-mlog

Automatically exported from code.google.com/p/csc-869-mlog

Play around with different settings of the StringToWordVector #4

Open GoogleCodeExporter opened 9 years ago

GoogleCodeExporter commented 9 years ago
Play around with different settings of the StringToWordVector:

- Different tokenizer
- Different stemmer
- Different stopword list or no stopword list
- Different minimum word frequency
- Different number of words to keep
- Different pruning
...

Document how each of these settings influences the results.
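As a starting point for that documentation, here is a minimal sketch of how some of these settings change the resulting vocabulary. It is a toy written against the Java standard library only, not Weka's StringToWordVector, and its behavior is deliberately simpler than the real filter. The class `VectorizerSettingsDemo`, the `Settings` holder, its field names, and the sample documents are all hypothetical and exist only for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;

public class VectorizerSettingsDemo {

    /** Hypothetical settings holder; the names are illustrative, not Weka's API. */
    public static final class Settings {
        final boolean lowerCase;      // fold tokens to lower case before counting
        final Set<String> stopwords;  // empty set means "no stopword list"
        final int minTermFreq;        // drop words seen fewer times than this
        final int wordsToKeep;        // keep at most this many words

        public Settings(boolean lowerCase, Set<String> stopwords,
                        int minTermFreq, int wordsToKeep) {
            this.lowerCase = lowerCase;
            this.stopwords = stopwords;
            this.minTermFreq = minTermFreq;
            this.wordsToKeep = wordsToKeep;
        }
    }

    /** Builds the vocabulary for the given documents under the given settings. */
    public static List<String> vocabulary(List<String> docs, Settings s) {
        // TreeMap keeps words in alphabetical order, which gives stable tie-breaking below.
        Map<String, Integer> counts = new TreeMap<>();
        for (String doc : docs) {
            // Very simple tokenizer: split on anything that is not a letter or digit.
            for (String token : doc.split("[^A-Za-z0-9]+")) {
                if (token.isEmpty()) {
                    continue;
                }
                String word = s.lowerCase ? token.toLowerCase(Locale.ROOT) : token;
                if (s.stopwords.contains(word)) {
                    continue;
                }
                counts.merge(word, 1, Integer::sum);
            }
        }

        // Apply the minimum frequency threshold.
        List<String> words = new ArrayList<>();
        for (Map.Entry<String, Integer> e : counts.entrySet()) {
            if (e.getValue() >= s.minTermFreq) {
                words.add(e.getKey());
            }
        }

        // Most frequent first; List.sort is stable, so ties stay alphabetical.
        words.sort((a, b) -> Integer.compare(counts.get(b), counts.get(a)));

        // Truncate to the requested vocabulary size.
        return new ArrayList<>(words.subList(0, Math.min(s.wordsToKeep, words.size())));
    }

    public static void main(String[] args) {
        List<String> docs = Arrays.asList(
                "The cat sat on the mat",
                "The dog sat on the log");
        Set<String> none = Collections.emptySet();
        Set<String> stop = new HashSet<>(Arrays.asList("the", "on"));

        System.out.println("baseline:      " + vocabulary(docs, new Settings(true, none, 1, 100)));
        System.out.println("with stoplist: " + vocabulary(docs, new Settings(true, stop, 1, 100)));
        System.out.println("min freq 2:    " + vocabulary(docs, new Settings(true, none, 2, 100)));
        System.out.println("keep 2 words:  " + vocabulary(docs, new Settings(true, none, 1, 2)));
    }
}
```

Running the same comparison with the real filter would mean changing one setting at a time and recording the attribute count and classifier accuracy for each run, so that each effect can be attributed to a single setting.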

Original issue reported on code.google.com by markus.neubrand on 5 Apr 2011 at 11:58

GoogleCodeExporter commented 9 years ago

Original comment by markus.neubrand on 6 Apr 2011 at 12:06

GoogleCodeExporter commented 9 years ago

Original comment by markus.neubrand on 6 Apr 2011 at 12:10