Dear all,
I would like to use `qdap::syllable_sum` to count the syllables in a large dataset (>450,000 rows). The text whose syllables I want to count is stored in one column of a tibble. I want to count the syllables for each row and store the sum in a separate column.
`syllable_sum` works fine with expected input, but throws an error when it encounters unexpected input, e.g. "____" or "1235". Instead of storing an NA value and moving on, the process stops and prints the error message "Subset out of bounds".
Is there a way to force the function to return NA for those rows where it cannot produce a syllable sum?
Here's a reproducible example with some 'good' and 'bad' input. I would like the function to create a new column syls_y with the sums (3 and 4) for the first two rows, and NA for the last three.
Thank you in advance for your tips!
`
# qdap syllable separation reprex
# Load packages
library(tidyverse)
library(qdap)
# Create data frame
data <- tibble(
  x = 1:5,
  y = c("A few words", "a few more words", "____", "1235", "+#$")
)
# Syllable separation of each row in data$y
data_syls <- data %>%
  mutate(syls_y = qdap::syllable_sum(y))`
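One direction that might work is to call the counter on each element separately and convert any error into NA with `tryCatch`. The following is only a rough base-R sketch under stated assumptions: `strict_count` is a hypothetical stand-in that merely mimics the failure mode (it is not qdap's algorithm, and its counts are crude), and `count_or_na` is an illustrative helper name, not part of any package.

```r
# Hypothetical stand-in for a syllable counter that errors on input without
# letters. It is NOT qdap's algorithm; it only imitates the failure mode.
strict_count <- function(text) {
  if (!grepl("[A-Za-z]", text)) stop("cannot count syllables")
  # Crude approximation: count runs of consecutive vowels.
  lengths(regmatches(text, gregexpr("[aeiouyAEIOUY]+", text)))
}

# Call `f` on each element of `x` separately; any error becomes NA.
count_or_na <- function(x, f) {
  vapply(
    x,
    function(s) tryCatch(as.numeric(f(s)), error = function(e) NA_real_),
    numeric(1),
    USE.NAMES = FALSE
  )
}

y <- c("A few words", "a few more words", "____", "1235", "+#$")
syls_y <- count_or_na(y, strict_count)

# Assumed usage with the real function (not run here):
# data %>% mutate(syls_y = count_or_na(y, qdap::syllable_sum))
```

This assumes the error is raised per element, so isolating each call keeps one bad row from stopping the rest. Calling the function once per row will likely be slower than a single vectorised call on 450,000 rows, so it may be worth timing on a sample first.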