Subsetting tweets relevant to my topics

I fitted a structural topic model (STM) with 56 topics to my dataset of tweets, using the stm package:

    XSTM <- stm(out$documents, out$vocab, K=56, max.em.its=75, init.type="Spectral", seed=8458159)

I then plotted the topic summary:

    plot(XSTM, type="summary", xlim=c(0,.2))

Of the 56 topics, 11 are relevant to me. I want to subset my original dataset to the tweets that are linked to these 11 topics. The only way I can think of is to manually get the most frequent words for all 11 topics:

    plot(XSTM, type="labels", topics=c(1,2,3,4,5,6,7,8,9,10,11))

Then I would write all these keywords down by hand and subset my original dataset so that it keeps only the tweets that contain at least one of them. For example:

    Dataset$Trump <- str_extract(Dataset$tweet_text, "Trump")
    Dataset$Hillary <- str_extract(Dataset$tweet_text, "Hillary")
    Dataset$president <- str_extract(Dataset$tweet_text, "president")
    Dataset_keywords <- Dataset %>%
      filter_at(vars(5,6,7), any_vars(. %in% c('Trump',"Hillary","president")))

The problem is that this example has only three words. In reality, I have 11 topics and each has around 8 most frequent words as identified by STM, which gives me around 88 terms.
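For reference, the keyword approach can be written without one column per word. This is only a minimal sketch: the `Dataset` rows below are made-up stand-ins, and the three-element `keywords` vector is a placeholder for the full list of roughly 88 terms. Like the `str_extract` version, it matches case-sensitive substrings.

```r
library(dplyr)
library(stringr)

# Made-up stand-in for the original data; only the column name
# tweet_text comes from the example above.
Dataset <- data.frame(
  tweet_text = c(
    "Trump spoke at the rally",
    "Nothing relevant here",
    "Hillary responds to the president"
  ),
  stringsAsFactors = FALSE
)

# Placeholder keyword vector; the real one would hold all ~88 terms.
# Assumes the terms contain no regex metacharacters.
keywords <- c("Trump", "Hillary", "president")

# Collapse the keywords into a single alternation pattern: "Trump|Hillary|president"
pattern <- str_c(keywords, collapse = "|")

# Keep every tweet that contains at least one keyword
Dataset_keywords <- Dataset %>%
  filter(str_detect(tweet_text, pattern))
```

As far as I can tell, `labelTopics(XSTM, topics = ..., n = 8)` returns the top words as matrices (for example in its `prob` element), so the `keywords` vector could probably be built with `unique(as.vector(...))` instead of being typed by hand. If the vocabulary was stemmed, the terms would only match as substrings of the raw tweet text.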

Does anyone have an easier way of identifying tweets relevant to my selected topics?
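One direction that might avoid keywords altogether is the per-document topic proportions stored in the fitted model as `XSTM$theta` (documents in rows, topics in columns). The sketch below uses a small hand-made matrix in place of `XSTM$theta`, and both `relevant_topics` and `threshold` are hypothetical values chosen for illustration. It also assumes that the rows of `theta` line up one-to-one with the rows of `Dataset`, which would not hold if any documents were dropped during preprocessing.

```r
# Hand-made stand-in for XSTM$theta: 4 documents x 4 topics,
# each row sums to 1 (assumption: the real matrix would be N x 56).
theta <- rbind(
  c(0.70, 0.10, 0.10, 0.10),
  c(0.05, 0.05, 0.60, 0.30),
  c(0.20, 0.40, 0.30, 0.10),
  c(0.10, 0.10, 0.10, 0.70)
)

# Made-up stand-in for the original data, one row per document in theta
Dataset <- data.frame(
  tweet_text = c("tweet a", "tweet b", "tweet c", "tweet d"),
  stringsAsFactors = FALSE
)

relevant_topics <- c(1, 2)  # hypothetical: indices of the topics of interest
threshold <- 0.5            # hypothetical: minimum combined share to keep a tweet

# Share of each document that belongs to the relevant topics
relevant_share <- rowSums(theta[, relevant_topics, drop = FALSE])

# Keep the tweets whose combined share reaches the threshold
keep <- relevant_share >= threshold
Dataset_topics <- Dataset[keep, , drop = FALSE]
```

The threshold is a judgment call: a lower value keeps more tweets that are only loosely related to the selected topics, while a higher value keeps fewer but more clearly related ones.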

bstewart / stm #214