Closed: kmunger closed this issue 6 years ago
When I run the function covars_make_all on Hansard speeches, 29 of the 33 measures are returned correctly, but the 4 measures related to word rarity come back as NA.
However, when I run covars_make_baselines on the same corpus, these 4 measures are computed correctly.
    setwd("C:/Users/kevin/Dropbox/Benoit_Spirling_Readability/hansard_data/")
    files <- list.files()

    ## initialize
    all_files <- read.csv(paste0(files[2]), stringsAsFactors = F)
    restricted <- filter(all_files, party == "Conservative" | party == "Labour")
    speakers <- all_files$speaker
    tab <- table(speakers)
    speakers_morethan10 <- names(tab[tab > 10])
    restricted <- filter(restricted, speaker %in% speakers_morethan10)
    restricted <- restricted[which(ntoken(restricted$text) > 10), ]
    data_corpus_speeches66 <- corpus(restricted)
    pos <- covars_make_all(data_corpus_speeches66, dependency = F)

    > pos$google_min_2000[100]
    [1] NA
    > pos$brown_mean[1000]
    [1] NA
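Rather than spot-checking single rows, one way to see which measures come back empty is to count missing values per column. This is a minimal base R sketch, not part of the package: `pos_demo` is a made-up stand-in for the real `pos` data frame, `meanWordChars` is a hypothetical column name, and all values are invented for illustration.

```r
# Hypothetical stand-in for the data frame returned by covars_make_all();
# column names echo the report above, values are invented for illustration.
pos_demo <- data.frame(
  meanWordChars   = c(4.2, 5.1, 4.8),                 # hypothetical working measure
  google_min_2000 = c(NA_real_, NA_real_, NA_real_),  # rarity measure, all missing
  brown_mean      = c(NA_real_, NA_real_, NA_real_)   # rarity measure, all missing
)

# Count the missing values in each column.
na_counts <- colSums(is.na(pos_demo))

# Keep the names of the columns that are NA for every document.
all_na_cols <- names(na_counts)[na_counts == nrow(pos_demo)]
print(all_na_cols)
```

Replacing `pos_demo` with the real `pos` object should list every measure that is NA across the whole corpus, which makes it easier to check whether exactly the four rarity measures are affected.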
@kmunger is this still a concern, or just an issue to fix (eventually) in the software?
@kbenoit Not an immediate concern; there's an easy workaround. It's just something to fix at some point.