Closed dianegal closed 7 years ago
Could you show me your script?
My script is as follows:
library(topicmodels)
library(tm)
library(slam)
library(MASS)
library(data.table)
library(mallet)
library(readr)
library(tidytext)
library(stringr)
library(dplyr)
memory.limit(100000000)
options(java.parameters = "-Xmx30000m")
documents<-fread("2013_clean.csv", colClasses=c(rep("character",2)))
colnames(documents)<-c("id", "text")
documents<-as.data.frame(documents)
by_word <- documents %>% unnest_tokens(word, text)
collapsed <- by_word %>%
  anti_join(stop_words, by = "word") %>%
  mutate(word = str_replace(word, "'", "")) %>%
  group_by(id) %>%
  summarize(text = paste(word, collapse = " "))
file.create(empty_file <- tempfile())
docs <- mallet.import(collapsed$id, collapsed$text, empty_file)
num.topics=50
topic.model <- MalletLDA(num.topics=num.topics)
topic.model$loadDocuments(mallet.instances)
vocabulary <- topic.model$getVocabulary()
word.freqs <- mallet.word.freqs(topic.model)
topic.model$train(10)
tidy(topic.model)
tidy(topic.model, matrix = "gamma")
Thanks for your time and suggestions, Diane
On 23 May 2017 at 12:20, suyi notifications@github.com wrote:
Could you show me your script?
Can you give any data? For example the first 100 lines.
I think this is because the mallet tidier in tidytext hasn't been submitted to CRAN yet: could you try installing the dev version from GitHub instead?
devtools::install_github("juliasilge/tidytext")
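For anyone curious why the CRAN version fails this way, the fallback named in the original warning ("No method for tidying an S3 object of class jobjRef") can be imitated in base R. This is only a sketch of S3 dispatch, not tidytext's actual code: `tidy_demo`, its methods, `fake_model`, and the returned columns are hypothetical stand-ins.

```r
# Hypothetical generic standing in for tidy(); all names are illustrative.
tidy_demo <- function(x, ...) UseMethod("tidy_demo")

# Fallback method: warns, then tries as.data.frame(), like the reported warning.
tidy_demo.default <- function(x, ...) {
  warning("No method for tidying an S3 object of class ", class(x)[1],
          ", using as.data.frame")
  as.data.frame(x)
}

# Stand-in object with the same class name as the rJava reference.
# (A real jobjRef is an S4 object; this plain list is for illustration only.)
fake_model <- structure(list(), class = "jobjRef")

# Without a class-specific method, the fallback runs and the coercion fails.
before <- tryCatch(
  suppressWarnings(tidy_demo(fake_model)),
  error = function(e) conditionMessage(e)
)

# Once a method for the class exists, dispatch picks it up instead.
tidy_demo.jobjRef <- function(x, ...) {
  data.frame(topic = integer(0), term = character(0), beta = numeric(0))
}
after <- tidy_demo(fake_model)
```

Here `before` holds the coercion error message and `after` is an (empty) data frame, which mirrors the difference between a tidytext build without the mallet tidier and one with it.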
Here is a sample of the data I am using. I will try the devtools installation shortly too.
On 23 May 2017 at 14:12, suyi notifications@github.com wrote:
Can you give any data? For example the first 100 lines.
Thanks, yes, installing via GitHub did resolve the original issue.
However, a new error now comes up when I run the tidy(topic.model) command:
Error in .jcall("RJavaTools", "Ljava/lang/Object;", "invokeMethod", cl, : java.lang.NullPointerException
Would you have any suggestions for this error? Thanks, Diane
On 23 May 2017 at 15:20, David Robinson notifications@github.com wrote:
I think this is because the mallet tidier in tidytext hasn't been submitted to CRAN yet: could you try installing the dev version from GitHub instead?
devtools::install_github("juliasilge/tidytext")
That error looks like Java is trying to open something but can't find it. Could it be your list of stop words? This Stack Overflow question may offer some guidance.
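One way to act on this suggestion is to confirm, before anything is handed to Java, that the stop word file exists and that the object passed to loadDocuments is actually defined. The posted script stores the imported instances in `docs` but passes `mallet.instances` to loadDocuments, so that name is worth checking too; whether either is the cause of the NullPointerException is an assumption. The helper `check_mallet_inputs` below is hypothetical and not part of mallet or tidytext.

```r
# Hypothetical helper: check inputs in plain R before they reach Java.
check_mallet_inputs <- function(stoplist_path, instances_name,
                                envir = parent.frame()) {
  c(
    stoplist_exists   = file.exists(stoplist_path),
    instances_defined = exists(instances_name, envir = envir, inherits = TRUE)
  )
}

# Same empty stop word file as in the posted script.
file.create(empty_file <- tempfile())

# Placeholder for the result of mallet.import(); no Java is needed for the check.
docs <- "placeholder for imported instances"

ok  <- check_mallet_inputs(empty_file, "docs")
bad <- check_mallet_inputs(empty_file, "mallet.instances")
```

In a fresh session `ok` is TRUE for both checks, while `bad` reports that `mallet.instances` is not defined.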
Hi, when I try the code from the topic modeling part, I get the following error:
Error in LDA(dtm, k, method = "Gibbs", control = list(nstart = nstart, : Each row of the input matrix needs to contain at least one non-zero entry
Please help me fix it. Thank you.
Answered @chankamiperera in issue #32.
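For readers who land here with the same LDA error: it is raised when some document has no terms left in the matrix, for example after stop word removal. A minimal sketch of the usual remedy, dropping all-zero rows before fitting, is shown below on a plain base R matrix with made-up counts. It is not necessarily the answer given in the other issue.

```r
# Toy document-term counts (made-up data). The "doc2" row is all zeros,
# as happens when every word in a document has been filtered out.
dtm <- matrix(
  c(2, 0, 1,
    0, 0, 0,
    0, 3, 1),
  nrow = 3, byrow = TRUE,
  dimnames = list(c("doc1", "doc2", "doc3"), c("topic", "model", "text"))
)

# Keep only documents that still contain at least one term.
non_empty <- rowSums(dtm) > 0
dtm_clean <- dtm[non_empty, , drop = FALSE]
```

For a sparse DocumentTermMatrix, `slam::row_sums(dtm) > 0` can play the role of `rowSums(dtm) > 0`.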
I believe everyone here has had their questions answered, so I'm closing this issue.
Thank you very much for this useful book and examples. I have been applying the code to my own set of data but each time I try to obtain the data from the mallet topic.model it gives an error as follows:
Error in as.data.frame.default(x) : cannot coerce class "structure("jobjRef", package = "rJava")" to a data.frame
In addition: Warning message:
In tidy.default(topic.model) : No method for tidying an S3 object of class jobjRef , using as.data.frame
Would you have any suggestions on how to fix this issue? Thanks