bstewart / stm

An R Package for the Structural Topic Model

R crashes when STM model converges #89

Closed adeldaoud closed 7 years ago

adeldaoud commented 7 years ago

R crashes relatively frequently when an STM model converges. See the image below for one example after a 30h+ estimation session. This has happened on two different computers, with different data sizes. I have not been able to identify any specific pattern leading to these crashes, as they do not seem to be deterministic.

The model estimation settings are usually:

library(stm)
library(lubridate) # year() is assumed to come from lubridate

# full
Year <- year(df$date) # year data from the environment

stmFit.full <- stm(out$documents, out$vocab, K = 0, prevalence = ~ s(Year),
                   max.em.its = 150, init.type = "Spectral", seed = 300, verbose = TRUE)

Any ideas what is going on?

image

adeldaoud commented 7 years ago

I have now made sure that R, RStudio, and all packages are up to date, but the model still crashes. Happy to share data if it helps to identify the problem.

bstewart commented 7 years ago

For anyone else following this issue: we've learned a lot more. It seems to be specific to Windows machines and particularly problematic with ngroups>1. We are still investigating, but if you are also hitting this problem, please feel free to share your experiences here.

bstewart commented 7 years ago

We've continued to pursue this and it seems to be a bug with matrixStats on large matrices and only on Windows.
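For anyone reporting the same crash, a minimal sketch of collecting the details this thread points to (operating system, R version, and the installed matrixStats and stm versions). The name `env_report` is purely illustrative, and the snippet only uses base R and `utils` calls:

```r
# Gather the environment details that appear relevant to this crash.
# `env_report` is a hypothetical name used only for this example.
pkg_version <- function(pkg) {
  # Report NA when the package is not installed instead of failing.
  if (requireNamespace(pkg, quietly = TRUE)) {
    as.character(utils::packageVersion(pkg))
  } else {
    NA_character_
  }
}

env_report <- list(
  os_type     = .Platform$OS.type,   # "windows" or "unix"
  r_version   = R.version.string,
  matrixStats = pkg_version("matrixStats"),
  stm         = pkg_version("stm")
)

print(env_report)
```

Pasting this output alongside the value of ngroups used in the stm() call should make reports easier to compare.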

bstewart commented 7 years ago

Okay, this is now fixed in the version on the master branch.

Huge thanks to @adeldaoud for both finding the bug initially and doing a ton of testing to help narrow down the cause.
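For readers who want the fix before it reaches CRAN, a minimal sketch of installing the development version, assuming the devtools package is acceptable and that the repository path is the one shown at the top of this page (bstewart/stm). The branch name "master" is taken from the comment above and may not match the repository's current default branch:

```r
# Assumption: devtools is used to install from GitHub; install it if missing.
if (!requireNamespace("devtools", quietly = TRUE)) {
  install.packages("devtools")
}

# Install stm from the master branch of the repository named in the page header.
devtools::install_github("bstewart/stm", ref = "master")

# Confirm which version is now installed.
print(utils::packageVersion("stm"))
```

Restart R after installing so that the updated package is the one loaded in the next session.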

cschwem2er commented 7 years ago

Hi Brandon,

Will these fixes hit CRAN before mid-October? I'd love to use the latest stm version for teaching a text analysis class.

bstewart commented 7 years ago

Definitely. I should actually have a submission in the CRAN queue by Friday; we are just making some tweaks to the vignette.