trinker / qdap

Quantitative Discourse Analysis Package: Bridging the gap between qualitative data and quantitative analysis
http://cran.us.r-project.org/web/packages/qdap/index.html

syllable_sum error message: Subset out of bounds #262

Open schaubst opened 3 years ago

schaubst commented 3 years ago

Dear all,

I would like to use `qdap::syllable_sum` to count the syllables in a large dataset (>450,000 rows). The text whose syllables I want to count is stored in a column of a tibble. I want to count the syllables for each row and store the sum in a separate column.

`syllable_sum` works fine with expected input, but throws an error when it encounters unexpected input, e.g. "____" or "12345". Instead of storing an NA value and moving on, the process stops and prints the error message "Subset out of bounds".

Is there a way to force the function to spit out an NA for those rows where the function cannot produce a syllable sum?

Here's a reproducible example with some 'good' and 'bad' input. I would like the function to create a new column syls_y with the sums (3 and 4) for the first two rows, and NA for the last three.

Thank you in advance for your tips!

```r
# qdap syllable separation reprex

# Load packages
library(tidyverse)
library(qdap)

# Create data frame
data <- tibble(
  x = 1:5,
  y = c("A few words", "a few more words", "____", "1235", "+#$")
)

# Syllable separation of each row in data$y
data_syls = data %>% mutate(syls_y = qdap::syllable_sum(data$y))
```

schaubst commented 3 years ago

I still need help with this. Any tip, recommendation or response would be greatly appreciated. Thank you!