require(quanteda)
S <- dictionary(list(japanese = c("Japaner", "Japanerin"),
                     korean = c("Koreaner", "Koreanerin")))
Then calculate the bias per word (e.g. Japaner, Japanerin), but aggregate by category (e.g. japanese) when calculating the multinomial distribution P.
It would be better to allow S to be a dictionary, such as the one defined above.
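A rough sketch of the aggregation step, in base R only. The per-word values in `word_p` are made-up placeholders rather than real output, and `S_list` is a plain named list standing in for the dictionary; both names are hypothetical and only illustrate the idea of summing within each category and renormalising.

```r
# Hypothetical per-word values (placeholders, NOT real results);
# in practice these would come from the per-word bias calculation.
word_p <- c(Japaner = 0.2, Japanerin = 0.3, Koreaner = 0.1, Koreanerin = 0.4)

# Plain named list standing in for the dictionary S (assumption for this sketch).
S_list <- list(japanese = c("Japaner", "Japanerin"),
               korean = c("Koreaner", "Koreanerin"))

# Sum the per-word values within each category ...
category_sum <- vapply(S_list, function(words) sum(word_p[words]), numeric(1))

# ... and renormalise so the category-level values sum to 1.
category_p <- category_sum / sum(category_sum)
```

This way the per-word values stay available for inspection, while `category_p` gives one probability per dictionary key.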