amrrs / text-analysis-with-udpipe

Text Analysis in R with Udpipe Package

Error in udpipe_annotate #4

Open pba2961 opened 2 years ago

pba2961 commented 2 years ago

Hi, here is my code:

pacman::p_load(dplyr, ggplot2, stringr, udpipe, lattice, tidytext, wordcloud2)

gnewsheadlines <- read.csv(file.choose(), stringsAsFactors = F)
head(gnewsheadlines)

head(sample(stop_words$headline, 15), 15)
udmodel_english <- udpipe_load_model(file = "C:/Users/Palam/Documents/english-ewt-ud-2.5-191206.udpipe")

Step 2 – count the number of total headlines by date and plot the results to examine

gnewsheadlines %>% group_by(date) %>% count() %>% arrange(desc(n))
gnewsheadlines %>% group_by(date) %>% count() %>% ggplot() + geom_line(aes(date, n, group = 1))

headlinegoogle <- gnewsheadlines %>% filter(date >= "3/31/2022", date <= "4/3/2022")

head(headlinegoogle)

g <- udpipe_annotate(udmodel_english, headlinegoogle$headline)
x <- data.frame(g)

I am getting this error when running udpipe_annotate:

g <- udpipe_annotate(udmodel_english,headlinegoogle$headline)
x <- data.frame(g)
Error in `[.data.table`(out, , `:=`(c("token_id", "token", "lemma", "upos", :
  Supplied 10 columns to be assigned an empty list (which may be an empty data.table or data.frame since they are lists too). To delete multiple columns use NULL instead. To add multiple empty list columns, use list(list()).
In addition: Warning message:
In strsplit(x$conllu, "\n", fixed = TRUE) :
  input string 1 is invalid UTF-8

amrrs commented 2 years ago

Can you print g and see what's in it?
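Since the warning above mentions "invalid UTF-8", it may also be worth checking the encoding of the headline column before annotating. The snippet below is only a minimal sketch of that check, not a confirmed fix for this issue: `headlines` is a hypothetical stand-in for `headlinegoogle$headline`, and the assumption that the CSV was saved as Latin-1 / Windows-1252 is a guess that would need adjusting to the real source encoding. It uses only base R (`validUTF8()` and `iconv()`).

```r
# Hypothetical stand-in for headlinegoogle$headline: the second element
# contains a raw Latin-1 byte (0xE9) that is not valid UTF-8 on its own.
headlines <- c("Plain ASCII headline", "Caf\xe9 opens downtown")

# validUTF8() is base R; TRUE in `bad` marks strings that are not valid UTF-8.
bad <- !validUTF8(headlines)
print(which(bad))

# Assumption: the file was written as Latin-1 / Windows-1252.
# Change `from` if the real source encoding differs.
headlines_utf8 <- headlines
headlines_utf8[bad] <- iconv(headlines[bad], from = "latin1", to = "UTF-8")

# Every element should now be valid UTF-8.
stopifnot(all(validUTF8(headlines_utf8)))

# With udpipe loaded and the model from the original post available,
# the cleaned vector would then be passed on as before:
# g <- udpipe_annotate(udmodel_english, x = headlines_utf8)
# x <- as.data.frame(g)
```

Only the elements flagged as invalid are converted, so strings that are already valid UTF-8 are left untouched rather than being re-encoded a second time.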