Open sunaynagoel opened 4 years ago
Another question
How can I capture the output of this code in a new table with two columns (headers: name, frequency)?
# Top 10 tokens without stemming
tokens %>% dfm( stem=FALSE ) %>% topfeatures( )
The output is
    natural     organic    products      brands ingredients    skincare   fragrance
         22          17           8           6           6           5           5
might using  want
    5     4     4
Regarding loading data, you would save the file to a specific directory, then set that as your working directory and use readLines()
to read in text files.
setwd( "C:/Users/Documents/TextAnalysis" ) # wherever your file is located
x <- readLines( "dear_john_letter_1.txt", warn=FALSE )
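If you would rather not change the working directory, one option is to build the full path once and pass it to readLines(). A minimal sketch, assuming a hypothetical sample file written to a temporary folder so the example runs anywhere; swap in your own folder and file name. Note that read.table() expects tabular data and splits each line into fields, which is likely why your text looked changed, while readLines() returns each line as-is.

```r
# Hypothetical setup: write a small sample file to a temporary folder
# so the example is self-contained; use your own folder and file instead.
data_dir  <- tempdir()
file_path <- file.path( data_dir, "sample_letter.txt" )
writeLines( c( "Dear John,", "I have moved to # 12 Elm St." ), file_path )

# Read by full path: no setwd() or file.choose() needed
x <- readLines( file_path, warn=FALSE )
x

# Collapse the lines into a single string if you want one document
doc <- paste( x, collapse=" " )
```

Storing the path in a variable at the top of the script means you only edit one line if the file moves.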
The output of topfeatures() is a named numeric vector. You can convert it to a data frame in a few ways.
You can create a new data frame and assign the vector names as one column and the vector values as another:
tf <- tokens %>% dfm( stem=FALSE ) %>% topfeatures( )
data.frame( name=names(tf), freq=tf, row.names=NULL )
name freq
1 provide 251
2 community 209
3 support 156
4 mission 144
5 education 142
6 youth 125
7 organization 117
8 educational 114
9 children 104
10 school 100
Since tables convert nicely to data frames you can also double-cast the numeric vector:
as.data.frame( as.table( tf ) )
Var1 Freq
1 provide 251
2 community 209
3 support 156
4 mission 144
5 education 142
6 youth 125
7 organization 117
8 educational 114
9 children 104
10 school 100
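To get exactly the headers asked for (name, frequency), both approaches can be adapted slightly. This is a sketch in base R only, using a small hand-made named vector as a hypothetical stand-in for the topfeatures() result, so it runs without a corpus:

```r
# Hypothetical stand-in for the topfeatures() result: a named numeric vector
tf <- c( natural=22, organic=17, products=8, brands=6 )

# Option 1: build the data frame directly with the headers you want
d1 <- data.frame( name=names(tf), frequency=unname(tf), row.names=NULL )

# Option 2: cast through a table, then rename the default Var1/Freq columns
d2 <- setNames( as.data.frame( as.table( tf ), stringsAsFactors=FALSE ),
                c( "name", "frequency" ) )

d1
```

Both data frames hold the same values; the first option is more explicit about column names, while the second is shorter when the default names are acceptable.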
Note that topfeatures() returns the top ten features by default. You can ask for as many as you would like:
tf <- tokens %>% dfm( stem=FALSE ) %>% topfeatures( n=100 )
@lecy thank you so much. These really help a lot.
What is a good way to load a local .txt file into R for text analysis with making a working directory?
file.choose() forces me to choose a file every time I run the code.
I also tried read.table() with the full path, but it changes the data in the file, and I am not sure why.