Closed ablaette closed 1 year ago
The save_document_topics() function uses printDocumentTopics(), but printDenseDocumentTopics() is much, much more efficient!
Sys.setenv(MALLET_DIR = "/opt/mallet/Mallet-202108")

library(biglda)
library(polmineR)
use("polmineR")

speeches <- polmineR::as.speeches(
  "GERMAPARLMINI",
  s_attribute_name = "speaker",
  s_attribute_date = "date"
)
instance_list <- as.instance_list(speeches)

BTM <- BigTopicModel(n_topics = 25L, alpha_sum = 5.1, beta = 0.1)
BTM$addInstances(instance_list)
BTM$estimate()

file <- rJava::.jnew("java/io/File", path.expand("~/Lab/tmp/dense.tsv"))
file_writer <- rJava::.jnew("java/io/FileWriter", file)
print_writer <- rJava::new(rJava::J("java/io/PrintWriter"), file_writer)
BTM$printDenseDocumentTopics(print_writer)
print_writer$close()

file <- rJava::.jnew("java/io/File", path.expand("~/Lab/tmp/notdense.tsv"))
file_writer <- rJava::.jnew("java/io/FileWriter", file)
print_writer <- rJava::new(rJava::J("java/io/PrintWriter"), file_writer)
BTM$printDocumentTopics(print_writer)
print_writer$close()

a <- data.table::fread("~/Lab/tmp/dense.tsv")
b <- data.table::fread("~/Lab/tmp/notdense.tsv")
save_document_topics() now uses $printDenseDocumentTopics(). More efficient indeed.