kbenoit / LIWCalike

R package to extend quanteda to mimic LIWC

WPS double-column crashes write.csv and alike #13

Closed sbor23 closed 6 years ago

sbor23 commented 6 years ago

We found that liwcalike produces a data.frame that cannot be converted to a matrix in a straightforward fashion, which breaks the export of CSV files. This is because the function creates the WPS column, which is itself a nested data.frame with two columns (document, meanSentenceLength).

str(dict_analysis)

'data.frame': 1804 obs. of 28 variables:
 $ docname : chr "text1" "text2" "text3" "text4" ...
 $ Segment : int 1 2 3 4 5 6 7 8 9 10 ...
 $ WC      : int 1 1 1 5 1 5 5 1 5 5 ...
 $ WPS     :Classes 'readability', 'textstat' and 'data.frame': 1804 obs. of 2 variables:
  ..$ document          : chr "text1" "text2" "text3" "text4" ...
  ..$ meanSentenceLength: num 1 1 1 5 1 5 5 1 5 5 ...
 $ Sixltr  :Class 'AsIs' chr [1:1804] "100.000" "100.000" "100.000" "40.000" ...
 [...]

Trying to export it leads to a crash in as.matrix.data.frame: https://github.com/wch/r-source/blob/af7f52f70101960861e5d995d3a4bec010bc89e6/src/library/base/R/dataframe.R#L1468

temp <- as.matrix(dict_analysis[4])

Error in ncol(xj) : object 'xj' not found

Similar problems exist with simple dplyr usage:

min_ <- dict_analysis %>% 
  filter(X_id %in% c(2315968))

Error: Column WPS must be a 1d atomic vector or a list
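A possible workaround, sketched on a toy data frame rather than the real dict_analysis object, is to replace the nested WPS column with its numeric meanSentenceLength vector before exporting or filtering. The names dict_toy and flatten_nested are hypothetical and are not part of LIWCalike. Whether the original errors reproduce on the toy data may depend on the R and dplyr versions, so the sketch only shows the flattening step.

```r
# Toy stand-in for dict_analysis (hypothetical data, not the real object)
dict_toy <- data.frame(
  docname = c("text1", "text2", "text3"),
  WC = c(1L, 5L, 5L),
  stringsAsFactors = FALSE
)

# Assigning a data.frame to a column nests it, like the WPS column above
dict_toy$WPS <- data.frame(
  document = c("text1", "text2", "text3"),
  meanSentenceLength = c(1, 5, 5),
  stringsAsFactors = FALSE
)

# Hypothetical helper: replace every nested data.frame column with one of
# its inner columns (by default the numeric meanSentenceLength)
flatten_nested <- function(x, value_col = "meanSentenceLength") {
  nested <- vapply(x, is.data.frame, logical(1))
  for (nm in names(x)[nested]) {
    x[[nm]] <- x[[nm]][[value_col]]
  }
  x
}

dict_flat <- flatten_nested(dict_toy)
str(dict_flat)

# Every column is now a plain vector, so the frame can be exported as usual
out_file <- tempfile(fileext = ".csv")
write.csv(dict_flat, out_file, row.names = FALSE)
```

For the real object, the equivalent one-liner would be dict_analysis$WPS <- dict_analysis$WPS$meanSentenceLength, assuming the column layout shown in the str() output above.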

kbenoit commented 6 years ago

LIWCalike is deprecated; please try installing quanteda.dictionaries instead, where you will find a liwcalike() function that works.

sbor23 commented 6 years ago

Wow ok that's unexpected news. Could you please put a disclaimer in the Readme.md and the project description then? We could have totally avoided bothering with LIWCalike in the first place if we knew...

Anyway, thanks for your work and for integrating it into quanteda!