senderle / topic-modeling-tool

A point-and-click tool for creating and analyzing topic models produced by MALLET.
https://senderle.github.io/topic-modeling-tool/documentation/2017/01/06/quickstart.html
Apache License 2.0
106 stars, 22 forks

CSV file as a set of documents? #77

Open StephenQuirolgico opened 4 years ago

StephenQuirolgico commented 4 years ago

Is it possible to use a CSV file as the set of input documents (i.e., where each row in the CSV file represents a different document)? We have a dataset containing thousands of documents and it's not practical to have each of these as a separate text file.

senderle commented 4 years ago

This is a feature of MALLET, and it used to be available in the TMT, but it proved difficult to maintain the tool while allowing both modes of input. However, we've done some refactoring since then, and it might be easier now. I've been thinking about this for a while and will look into it — thanks for the suggestion!
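
Until CSV input is supported in the tool itself, one possible workaround is to split the CSV into one text file per row and point the tool at the resulting directory. The following is a minimal sketch, not part of the Topic Modeling Tool: the function name `csv_to_text_files` and the column names `text` and `id` are assumptions made for illustration, and the CSV is assumed to be UTF-8 with a header row.

```python
import csv
from pathlib import Path


def csv_to_text_files(csv_path, out_dir, text_column="text", id_column=None):
    """Write each CSV row's text to its own .txt file; return the number written.

    Hypothetical helper: the column names are assumptions about your CSV.
    """
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    count = 0
    with open(csv_path, newline="", encoding="utf-8") as handle:
        for index, row in enumerate(csv.DictReader(handle)):
            text = (row.get(text_column) or "").strip()
            if not text:
                continue  # skip rows that contain no document text
            # Name the file after the id column when available, else the row index.
            fallback = f"doc_{index:06d}"
            raw_name = (row.get(id_column) if id_column else None) or fallback
            # Replace characters that are unsafe in file names.
            safe_name = "".join(
                c if c.isalnum() or c in "-_" else "_" for c in raw_name
            )
            (out / f"{safe_name}.txt").write_text(text, encoding="utf-8")
            count += 1
    return count
```

For example, `csv_to_text_files("corpus.csv", "input_docs", text_column="abstract", id_column="id")` would fill `input_docs` with one file per non-empty row (both file names and column names here are placeholders). Rows whose ids sanitize to the same name would overwrite one another, so unique ids are assumed.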