quanteda / quanteda

An R package for the Quantitative Analysis of Textual Data
https://quanteda.io
GNU General Public License v3.0

Function dfm_stem() does not exist but is required to replace dfm(stem) #2403

Closed jugdemon closed 3 months ago

jugdemon commented 3 months ago

Describe the bug

I called dfm() with the stem argument, and the error message tells me to use dfm_stem() instead, but no function named dfm_stem() exists.

Reproducible code

Please paste minimal code that reproduces the bug. If possible, please upload the data file as .rds.

dfm(corpus, stem=TRUE)

Results in:

Error:
! The `stem` argument of `dfm()` was deprecated in quanteda 3.0 and is now defunct.
ℹ Please use `dfm_stem()` instead.

When I try:

dfm_stem(corpus, stem=TRUE)

it results in:

Error: 'dfm_stem' is not an exported object from 'namespace:quanteda'

Expected behavior

There should be a function dfm_stem but there isn't.

System information

Please run sessionInfo() and paste the output.

R version 4.4.1 (2024-06-14)
Platform: x86_64-apple-darwin20
Running under: macOS Sonoma 14.5

Matrix products: default
BLAS:   /System/Library/Frameworks/Accelerate.framework/Versions/A/Frameworks/vecLib.framework/Versions/A/libBLAS.dylib 
LAPACK: /Library/Frameworks/R.framework/Versions/4.4-x86_64/Resources/lib/libRlapack.dylib;  LAPACK version 3.12.0

locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8

time zone: Europe/Berlin
tzcode source: internal

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
 [1] berryFunctions_1.22.5 irr_0.84.1            lpSolve_5.6.20        igraph_2.0.3          quanteda_4.0.2       
 [6] xtable_1.8-4          stargazer_5.2.3       lme4_1.1-35.5         Matrix_1.7-0          stm_1.3.7            
[11] pacman_0.5.1          ggridges_0.5.6        scales_1.3.0          maps_3.4.2            data.table_1.15.4    
[16] rjson_0.2.21          SnowballC_0.7.1       tm_0.7-13             NLP_0.2-1             wordcloud2_0.2.1     
[21] wordcloud_2.6         RColorBrewer_1.1-3    lubridate_1.9.3       forcats_1.0.0         stringr_1.5.1        
[26] dplyr_1.1.4           purrr_1.0.2           readr_2.1.5           tidyr_1.3.1           tibble_3.2.1         
[31] ggplot2_3.5.1         tidyverse_2.0.0       knitr_1.47           

loaded via a namespace (and not attached):
 [1] tidyselect_1.2.1    viridisLite_0.4.2   farver_2.1.2        fastmap_1.2.0       digest_0.6.36       timechange_0.3.0   
 [7] lifecycle_1.0.4     yardstick_1.3.1     magrittr_2.0.3      compiler_4.4.1      rlang_1.1.4         sass_0.4.9         
[13] tools_4.4.1         utf8_1.2.4          yaml_2.3.8          labeling_0.4.3      stopwords_2.3       htmlwidgets_1.6.4  
[19] bit_4.0.5           xml2_1.3.6          pkgload_1.4.0       abind_1.4-5         withr_3.0.0         grid_4.4.1         
[25] fansi_1.0.6         colorspace_2.1-0    MASS_7.3-61         cli_3.6.3           rmarkdown_2.27      crayon_1.5.3       
[31] generics_0.1.3      rstudioapi_0.16.0   httr_1.4.7          tzdb_0.4.0          minqa_1.2.7         cachem_1.1.0       
[37] splines_4.4.1       parallel_4.4.1      vctrs_0.6.5         boot_1.3-30         jsonlite_1.8.8      slam_0.1-51        
[43] ISOcodes_2024.02.12 hms_1.1.3           bit64_4.0.5         jquerylib_0.1.4     glue_1.7.0          nloptr_2.1.1       
[49] stringi_1.8.4       gtable_0.3.5        munsell_0.5.1       pillar_1.9.0        htmltools_0.5.8.1   R6_2.5.1           
[55] vroom_1.6.5         evaluate_0.24.0     lattice_0.22-6      bslib_0.7.0         Rcpp_1.0.12         fastmatch_1.1-4    
[61] nlme_3.1-165        mgcv_1.9-1          xfun_0.45           pkgconfig_2.0.3  

Additional info

Please add any other information about the issue.

koheiw commented 3 months ago

True, it is dfm_wordstem(). You cannot construct a dfm directly from a corpus. Please tokenize first.

dfm_wordstem(dfm(tokens(corpus)))
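
For anyone landing here with the same error, the tokenize-then-stem sequence described above might look roughly like the following. This is only a sketch: the two-document corpus is invented for illustration and is not from the original report, and `language = "english"` is an assumption about the text being stemmed.

```r
library(quanteda)

# Toy corpus, made up for this example (not the reporter's data)
corp <- corpus(c(doc1 = "Running runs and runners run.",
                 doc2 = "She walked while he was walking."))

# 1. Tokenize first; per the reply above, a dfm is not built directly from a corpus
toks <- tokens(corp, remove_punct = TRUE)

# 2. Build the document-feature matrix from the tokens
dfmat <- dfm(toks)

# 3. Stem the features; dfm_wordstem() takes no `stem` argument
dfmat_stem <- dfm_wordstem(dfmat, language = "english")

# Inspect the stemmed feature names
featnames(dfmat_stem)
```

Stemming can also be applied one step earlier with tokens_wordstem(toks) before calling dfm(), which may be preferable if other token-level processing depends on the stemmed forms.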