pickettj / eurasia_corpus_tool

Distant reading of nested corpuses of Eurasian texts.

better KWIC tool #2

Open pickettj opened 4 years ago

pickettj commented 4 years ago

@iamlemec

I want to write a better keyword-in-context (KWIC) tool and was hoping you might have some tips on the following:

(1) Essentially, I want to do something like this R function, but in Python. Currently, I combine raw tokens from many texts to view keywords in context. It would be nice to retain information about which text each token comes from (i.e. the dictionary key, since my data is stored as dictionaries of word tokens).

(2) There must be a fairly straightforward way to view two successive tokens in context.

iamlemec commented 4 years ago

I think in your Custom KWIC (beta) section, you should be able to do `[x for x in five_grams if x[2] == "پانصد"]`. Then it's just a matter of formatting `x` into a proper string and potentially making the conditional more sophisticated.
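
As a rough sketch of how points (1) and (2) might be handled together: the function below assumes the data is a dictionary mapping each text's key to a list of tokens, and it scans each text separately so the key stays attached to every hit. The names `kwic`, `corpus`, and `window`, and the toy data, are hypothetical and not taken from the repository. It slices the token list directly instead of using precomputed `five_grams`, which is a swapped-in approach rather than the one suggested above.

```python
def kwic(corpus, phrase, window=2):
    """Find a token, or a run of successive tokens, in each text of a corpus.

    corpus: dict mapping a text key to its list of tokens (assumed layout).
    phrase: a single token (str) or a sequence of successive tokens.
    window: number of context tokens to keep on each side.
    Returns a list of (text_key, left_context, match, right_context) tuples.
    """
    if isinstance(phrase, str):
        phrase = [phrase]
    phrase = list(phrase)
    n = len(phrase)
    hits = []
    for key, tokens in corpus.items():
        for i in range(len(tokens) - n + 1):
            # Compare a slice of the same length as the phrase.
            if list(tokens[i:i + n]) == phrase:
                left = tokens[max(0, i - window):i]
                right = tokens[i + n:i + n + window]
                hits.append(
                    (key, " ".join(left), " ".join(phrase), " ".join(right))
                )
    return hits


# Hypothetical toy data standing in for the real dictionaries of tokens.
corpus = {
    "text_a": ["مبلغ", "پانصد", "تنگه", "نقد", "داده", "شد"],
    "text_b": ["یک", "هزار", "و", "پانصد", "تنگه"],
}

# Single keyword, with the originating text key on every line.
for key, left, match, right in kwic(corpus, "پانصد"):
    print(f"{key}: {left} [{match}] {right}")

# Two successive tokens in context.
for key, left, match, right in kwic(corpus, ["پانصد", "تنگه"]):
    print(f"{key}: {left} [{match}] {right}")
```

Because the context is sliced per text, matches near the start or end of a text are still returned with a shorter context, and context never spills across text boundaries.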