Go to Language/Word Context, select "similarity" from the measure drop-down, and search for a word in the corpus. The app returns a scatter plot of the words most associated with the search word (according to word2vec and cosine similarity; see line 102). If you click one of the scatter plot points and wait ~9 seconds, a data frame pops up showing that word's keyword in context (KWIC).
Obviously, it's a problem that it takes ~9 seconds for results to return. Can we optimize KWIC so it returns results in a reasonable amount of time?
Here's the KWIC code: https://github.com/stephbuon/hansard-shiny/tree/main/app/modules/kwic
It's called by: https://github.com/stephbuon/hansard-shiny/blob/main/app/modules/word-context/word_context.R
Caching the results (kwick_cache.R) lets us return results in real time, but I don't know whether the cache would grow too large.
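One way to keep the cache from growing without limit is to cap the number of stored results and drop the oldest when the cap is reached. This is only a minimal sketch in base R, not what kwick_cache.R currently does: `make_bounded_cache`, `slow_kwic`, and `max_entries` are hypothetical names, and `slow_kwic` is a stand-in for the real KWIC lookup.

```r
# Hypothetical helper: wraps a slow function in a cache that holds at most
# `max_entries` results and evicts the oldest-inserted entry when full.
make_bounded_cache <- function(fn, max_entries = 100) {
  store <- new.env(parent = emptyenv())  # key -> cached result
  keys <- character(0)                   # insertion order, oldest first

  function(key) {
    if (exists(key, envir = store, inherits = FALSE)) {
      return(get(key, envir = store))    # cache hit: skip the slow call
    }
    value <- fn(key)
    if (length(keys) >= max_entries) {
      rm(list = keys[1], envir = store)  # evict the oldest entry
      keys <<- keys[-1]
    }
    assign(key, value, envir = store)
    keys <<- c(keys, key)
    value
  }
}

# Stand-in for the real KWIC lookup; counts how often it actually runs.
n_calls <- 0
slow_kwic <- function(word) {
  n_calls <<- n_calls + 1
  data.frame(keyword = word)
}

cached_kwic <- make_bounded_cache(slow_kwic, max_entries = 2)
cached_kwic("corn")   # computed
cached_kwic("corn")   # served from the cache
cached_kwic("wheat")  # computed
cached_kwic("tax")    # computed; "corn" is evicted to make room
```

With a cap like this, memory use is bounded by `max_entries` times the size of one result, at the cost of recomputing words that have been evicted.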
You'll see that I am borrowing a function from Quanteda (this one: https://quanteda.io/reference/kwic.html)
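One possibility I have not profiled yet: if the text is tokenised again on every click, that step may account for much of the delay. A rough sketch of tokenising once and reusing the tokens for each lookup is below. The two-sentence `debates` vector is toy data, and `debate_tokens` and `get_kwic` are hypothetical names, not what the module currently uses.

```r
library(quanteda)

# Toy stand-in for the corpus; the real app would load the debates text.
debates <- c(
  d1 = "The price of corn rose sharply this year.",
  d2 = "Members debated the corn laws and the price of bread."
)

# Tokenise once, e.g. at app start-up, and keep the result in memory.
debate_tokens <- tokens(debates)

# Hypothetical per-click handler: only the pattern search runs on each click.
get_kwic <- function(word, window = 5) {
  as.data.frame(kwic(debate_tokens, pattern = word, window = window))
}

result <- get_kwic("corn")
```

If the tokens object is too large to hold in memory for the full corpus, the same idea could be applied per decade or per subset, but that is a guess until the call is timed.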