try these methods

using udpipe:

extract only the nouns per episode
look at co-occurrences within each sentence, or at co-occurrences of words that are close to one another (in each other's neighbourhood)
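A minimal sketch of the noun extraction, assuming the episodes have already been annotated with udpipe. The data frame below is hypothetical toy data standing in for `as.data.frame(udpipe_annotate(...))`, with `doc_id` used as the episode identifier; the lemmas are invented for illustration.

```r
library(udpipe)

# Hypothetical toy annotation: one row per token, as udpipe would return it.
# In practice x would come from as.data.frame(udpipe_annotate(model, text, doc_id)).
x <- data.frame(
  doc_id      = rep(c("ep1", "ep2"), each = 4),
  sentence_id = 1L,
  lemma       = c("attacker", "steal", "password", "database",
                  "browser", "fix", "security", "bug"),
  upos        = c("NOUN", "VERB", "NOUN", "NOUN",
                  "NOUN", "VERB", "NOUN", "NOUN"),
  stringsAsFactors = FALSE
)

# Keep only the nouns
nouns <- subset(x, upos == "NOUN")

# List of noun lemmas per episode
nouns_per_episode <- split(nouns$lemma, nouns$doc_id)

# Frequency of each noun within each episode
noun_freq <- document_term_frequencies(nouns, document = "doc_id", term = "lemma")
print(noun_freq)
```

Using `lemma` rather than `token` groups singular and plural forms together, which is probably what is wanted when comparing episodes.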

library(udpipe)
## x is the annotated data frame, one row per token (output of udpipe_annotate converted with as.data.frame)
## Collocation (words following one another)
stats <- keywords_collocation(x = x, 
                              term = "token", group = c("doc_id", "paragraph_id", "sentence_id"),
                              ngram_max = 4)
## Co-occurrences: how frequently do words occur in the same sentence, in this case only nouns or adjectives
stats <- cooccurrence(x = subset(x, upos %in% c("NOUN", "ADJ")), 
                      term = "lemma", group = c("doc_id", "paragraph_id", "sentence_id"))
## Co-occurrences: how frequently do words follow one another
stats <- cooccurrence(x = x$lemma, 
                      relevant = x$upos %in% c("NOUN", "ADJ"))
## Co-occurrences: how frequently do words follow one another, even if we skip 2 words in between
stats <- cooccurrence(x = x$lemma, 
                      relevant = x$upos %in% c("NOUN", "ADJ"), skipgram = 2)

make a network plot of the result
extract only the verbs? to see actionable words?
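A rough sketch of the network plot, using igraph (an assumption; ggraph would work as well). The `stats` table here is hypothetical toy data in the shape `cooccurrence()` returns (`term1`, `term2`, `cooc`), so the block runs on its own.

```r
library(igraph)

# Hypothetical co-occurrence table, same columns as the output of udpipe::cooccurrence()
stats <- data.frame(
  term1 = c("password", "password", "browser", "security"),
  term2 = c("database", "attacker", "bug", "bug"),
  cooc  = c(5, 3, 4, 2),
  stringsAsFactors = FALSE
)

# Keep the strongest pairs only, so the plot stays readable
top_pairs <- head(stats[order(-stats$cooc), ], 30)

# Words become vertices, co-occurring pairs become edges; cooc is kept as an edge attribute
wordnetwork <- graph_from_data_frame(top_pairs, directed = FALSE)

# Edge width reflects how often the two words co-occur
plot(wordnetwork,
     edge.width = E(wordnetwork)$cooc,
     vertex.size = 5,
     vertex.label.dist = 1)
```

For the verbs idea, the same plot could be built from a co-occurrence table computed on `subset(x, upos == "VERB")` instead of nouns and adjectives.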

with textrank: PageRank on words (only nouns and adjectives, for example)
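To make the idea concrete, here is a sketch that does the PageRank step by hand with udpipe and igraph rather than through the textrank package itself (as far as I know, `textrank_keywords()` wraps a similar word-graph approach). The lemmas and tags are hypothetical toy data.

```r
library(udpipe)
library(igraph)

# Hypothetical toy annotation, one row per token in reading order
x <- data.frame(
  lemma = c("weak", "password", "leak", "password", "database",
            "old", "browser", "have", "security", "bug"),
  upos  = c("ADJ", "NOUN", "VERB", "NOUN", "NOUN",
            "ADJ", "NOUN", "VERB", "NOUN", "NOUN"),
  stringsAsFactors = FALSE
)

# Only nouns and adjectives take part in the word graph
relevant <- x$upos %in% c("NOUN", "ADJ")

# Word pairs that follow one another, allowing one skipped word in between
cooc <- cooccurrence(x = x$lemma, relevant = relevant, skipgram = 1)

# Undirected word graph, weighted by co-occurrence count
g <- graph_from_data_frame(as.data.frame(cooc), directed = FALSE)
pr <- page_rank(g, weights = E(g)$cooc)

# Words sorted from most to least central
ranking <- sort(pr$vector, decreasing = TRUE)
print(ranking)
```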

use RAKE on sets of documents, say 10 docs at a time? (extracting the important topics in every episode)
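One possible way to run RAKE per batch of documents, sketched with `keywords_rake()` from udpipe. The annotation is hypothetical toy data, and `docs_per_batch` is set to 2 only so the toy example has more than one batch; the note above suggests 10.

```r
library(udpipe)

# Hypothetical toy annotation: four episodes, three tokens each
x <- data.frame(
  doc_id      = rep(c("ep1", "ep2", "ep3", "ep4"), each = 3),
  sentence_id = 1L,
  lemma       = c("weak", "password", "leak",
                  "password", "database", "fail",
                  "old", "browser", "crash",
                  "browser", "bug", "remain"),
  upos        = rep(c("ADJ", "NOUN", "VERB", "NOUN", "NOUN", "VERB"), times = 2),
  stringsAsFactors = FALSE
)

# Assign every document to a batch, in order of appearance
docs_per_batch <- 2  # assumption for the toy data; 10 in the note
doc_ids <- unique(x$doc_id)
batch_of_doc <- (seq_along(doc_ids) - 1) %/% docs_per_batch + 1
x$batch <- batch_of_doc[match(x$doc_id, doc_ids)]

# Run RAKE separately on each batch, using only nouns and adjectives as candidates
rake_per_batch <- lapply(split(x, x$batch), function(batch) {
  keywords_rake(x = batch, term = "lemma",
                group = c("doc_id", "sentence_id"),
                relevant = batch$upos %in% c("NOUN", "ADJ"),
                ngram_max = 2, n_min = 1)
})
print(rake_per_batch)
```

`n_min = 1` is used here only because the toy data is tiny; on real episodes a higher threshold would likely filter out one-off phrases.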

RMHogervorst / NLP_SN

try these methods #1