senderle / topic-modeling-tool

A point-and-click tool for creating and analyzing topic models produced by MALLET.
https://senderle.github.io/topic-modeling-tool/documentation/2017/01/06/quickstart.html
Apache License 2.0
106 stars, 22 forks

Feature/autoseg #37

Closed by senderle 8 years ago

senderle commented 8 years ago

This pull request implements automatic segmentation. It also adds much better CSV support: escaped quotation marks, delimiters, and newlines are all handled (relatively) gracefully. There still seem to be some encoding issues, but they haven't bothered me too much. I recommend using CSV files not generated by Excel, which seems addicted to strange encodings that break Java's Unicode support.
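
For anyone unsure what "escaped quotation marks, delimiters, and newlines" means in practice, here is a rough sketch of the usual quoted-field convention, where a quoted field may contain the delimiter or a line break and a literal quote is written as two quotes. This is not the code in this pull request. The class name `CsvSketch` and the method `parse` are made up for illustration, and the quoting rules are assumed to follow the common RFC 4180 style.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration only; not the parser shipped in this pull request.
public class CsvSketch {

    // Splits CSV text into records. Inside a quoted field, delimiters and
    // newlines are kept as data, and "" is read as one literal quote.
    // Blank lines between records are skipped.
    public static List<List<String>> parse(String text, char delimiter) {
        List<List<String>> records = new ArrayList<>();
        List<String> record = new ArrayList<>();
        StringBuilder field = new StringBuilder();
        boolean inQuotes = false;
        boolean pending = false; // true once the current record has any content

        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            if (inQuotes) {
                if (c == '"') {
                    if (i + 1 < text.length() && text.charAt(i + 1) == '"') {
                        field.append('"'); // doubled quote: literal quote
                        i++;
                    } else {
                        inQuotes = false;  // closing quote
                    }
                } else {
                    field.append(c);       // delimiter or newline kept as data
                }
            } else if (c == '"') {
                inQuotes = true;
                pending = true;
            } else if (c == delimiter) {
                record.add(field.toString());
                field.setLength(0);
                pending = true;
            } else if (c == '\n' || c == '\r') {
                // Treat \r\n as a single line ending.
                if (c == '\r' && i + 1 < text.length() && text.charAt(i + 1) == '\n') {
                    i++;
                }
                if (pending) {
                    record.add(field.toString());
                    field.setLength(0);
                    records.add(record);
                    record = new ArrayList<>();
                    pending = false;
                }
            } else {
                field.append(c);
                pending = true;
            }
        }
        // Flush a final record that has no trailing newline.
        if (pending) {
            record.add(field.toString());
            records.add(record);
        }
        return records;
    }
}
```

With this sketch, a row such as `1,"He said ""hi"", then left"` comes back as two fields, the second containing both the quotes and the comma, and a quoted field that spans two lines stays a single field.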

Closes #17.