odelmarcelle closed 1 year ago
Thank you for this. It looks like a floating point precision problem.
``` r
> dict1 <- dictionary(list(people = c("family", "couple", "kids"),
+                          space = c("alien", "planet", "space"),
+                          moster = c("monster*", "ghost*", "zombie*"),
+                          war = c("war", "soldier*", "tanks"),
+                          crime = c("crime*", "murder", "killer")
+ ))
> slda1 <- textmodel_seededlda(dfmt * 100, dict1, max_iter = 100)
> slda2 <- textmodel_seededlda(dfmt, dict1, max_iter = 100)
> rowSums(slda1$phi)
people  space moster    war  crime
     1      1      1      1      1
> rowSums(slda2$phi)
   people     space    moster       war     crime
1.0000000 1.0000000 0.9998885 1.0000000 1.0000000
```
No, it is actually a rounding problem in the Array object. I will fix it.
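A tolerance-based check along the following lines can confirm whether each row of `phi` sums to one. This is only a sketch in base R: the matrix is a made-up toy rather than seededlda output, and the names `max_dev` and `rows_ok` and the `1e-8` threshold are assumptions chosen for illustration.

``` r
# Toy topic-word matrix (hypothetical values, not seededlda output).
# Rows are topics, columns are words.
phi <- matrix(c(0.5, 0.3, 0.2,
                0.1, 0.6, 0.3),
              nrow = 2, byrow = TRUE,
              dimnames = list(c("topic1", "topic2"), c("w1", "w2", "w3")))

# Largest absolute deviation of any row sum from 1.
max_dev <- max(abs(rowSums(phi) - 1))

# Compare against a tolerance instead of using `==`, because sums of
# doubles are not guaranteed to equal 1 exactly.
tol <- 1e-8  # assumed threshold; adjust to taste
rows_ok <- max_dev < tol
```

A deviation on the order of `1e-4`, like the `0.9998885` row above, is far larger than ordinary floating point error and would fail a check of this kind.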
@odelmarcelle can you check if #61 fixed the problem?
Yes, that solves it for me :smiley: Thanks for the quick response.
@koheiw Do you plan to release a new version on CRAN soon? This actually causes some tests to fail in a package I created (https://cran.r-project.org/web/checks/check_results_sentopics.html).
I actually submitted it to CRAN yesterday.
Hello,
I observed a strange behavior when applying the seededLDA model: the topic-word distribution does not always sum to one.
I recently updated the package and I don't remember having this issue before (although my previous version might have been fairly old).
Created on 2023-06-02 with reprex v2.0.2
Session info
``` r
sessioninfo::session_info()
#> ─ Session info ───────────────────────────────────────────────────────────────
#>  setting  value
#>  version  R version 4.3.0 (2023-04-21 ucrt)
#>  os       Windows 10 x64 (build 19044)
#>  system   x86_64, mingw32
#>  ui       RTerm
#>  language (EN)
#>  collate  French_Belgium.utf8
#>  ctype    French_Belgium.utf8
#>  tz       Europe/Paris
#>  date     2023-06-02
#>  pandoc   2.19.2 @ C:/Program Files/RStudio/resources/app/bin/quarto/bin/tools/ (via rmarkdown)
#>
#> ─ Packages ───────────────────────────────────────────────────────────────────
#>  ! package      * version date (UTC) lib source
#>    cli            3.6.1   2023-03-23 [1] CRAN (R 4.3.0)
#>    digest         0.6.31  2022-12-11 [1] CRAN (R 4.3.0)
#>    evaluate       0.21    2023-05-05 [1] CRAN (R 4.3.0)
#>    fastmap        1.1.1   2023-02-24 [1] CRAN (R 4.3.0)
#>    fastmatch      1.1-3   2021-07-23 [1] CRAN (R 4.3.0)
#>    fs             1.6.2   2023-04-25 [1] CRAN (R 4.3.0)
#>    glue           1.6.2   2022-02-24 [1] CRAN (R 4.3.0)
#>    htmltools      0.5.5   2023-03-23 [1] CRAN (R 4.3.0)
#>    knitr          1.43    2023-05-25 [1] CRAN (R 4.3.0)
#>    lattice        0.21-8  2023-04-05 [2] CRAN (R 4.3.0)
#>    lifecycle      1.0.3   2022-10-07 [1] CRAN (R 4.3.0)
#>    magrittr       2.0.3   2022-03-30 [1] CRAN (R 4.3.0)
#>    Matrix         1.5-4   2023-04-04 [2] CRAN (R 4.3.0)
#>    proxyC       * 0.3.3   2022-10-06 [1] CRAN (R 4.3.0)
#>    purrr          1.0.1   2023-01-10 [1] CRAN (R 4.3.0)
#>    quanteda     * 3.3.1   2023-05-18 [1] CRAN (R 4.3.0)
#>    R.cache        0.16.0  2022-07-21 [1] CRAN (R 4.3.0)
#>    R.methodsS3    1.8.2   2022-06-13 [1] CRAN (R 4.3.0)
#>    R.oo           1.25.0  2022-06-12 [1] CRAN (R 4.3.0)
#>    R.utils        2.12.2  2022-11-11 [1] CRAN (R 4.3.0)
#>    Rcpp           1.0.10  2023-01-22 [1] CRAN (R 4.3.0)
#>  D RcppParallel   5.1.7   2023-02-27 [1] CRAN (R 4.3.0)
#>    reprex         2.0.2   2022-08-17 [1] CRAN (R 4.3.0)
#>    rlang          1.1.1   2023-04-28 [1] CRAN (R 4.3.0)
#>    rmarkdown      2.22    2023-06-01 [1] CRAN (R 4.3.0)
#>    rstudioapi     0.14    2022-08-22 [1] CRAN (R 4.3.0)
#>    seededlda    * 1.0.0   2023-05-31 [1] CRAN (R 4.3.0)
#>    sessioninfo    1.2.2   2021-12-06 [1] CRAN (R 4.3.0)
#>    stopwords      2.3     2021-10-28 [1] CRAN (R 4.3.0)
#>    stringi        1.7.12  2023-01-11 [1] CRAN (R 4.3.0)
#>    styler         1.10.0  2023-05-24 [1] CRAN (R 4.3.0)
#>    vctrs          0.6.2   2023-04-19 [1] CRAN (R 4.3.0)
#>    withr          2.5.0   2022-03-03 [1] CRAN (R 4.3.0)
#>    xfun           0.39    2023-04-20 [1] CRAN (R 4.3.0)
#>    yaml           2.3.7   2023-01-23 [1] CRAN (R 4.3.0)
#>
#>  [1] C:/Users/odlmarce/AppData/Local/R/win-library/4.3
#>  [2] C:/Program Files/R/R-4.3.0/library
#>
#>  D ── DLL MD5 mismatch, broken installation.
#>
#> ──────────────────────────────────────────────────────────────────────────────
```