N-gram frequency for a given n-gram in a given year was computed as the number of instances of that n-gram in that year divided by the total number of words used that year. This seems to me to mix two different units: n-grams and individual words. Why wasn't the frequency instead defined as the number of instances of the n-gram divided by the total number of n-grams that year (which isn't necessarily equal to the number of words)? This may be an amateurish technical question, but it confused me.
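To make the two candidate denominators concrete, here is a minimal sketch of what I mean. It assumes whitespace tokenization, a made-up two-text corpus standing in for one year, and n-grams that do not cross text boundaries; the helper names `ngrams` and `frequencies` are my own and are not taken from the original method.

```python
def ngrams(tokens, n):
    """Return all contiguous n-grams in a token list as tuples."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def frequencies(texts, target, n):
    """Compute the frequency of `target` under both normalizations.

    Returns (count / total words, count / total n-grams).
    """
    total_words = 0
    total_ngrams = 0
    count = 0
    for text in texts:
        tokens = text.split()  # assumption: simple whitespace tokenization
        grams = ngrams(tokens, n)
        total_words += len(tokens)
        total_ngrams += len(grams)
        count += grams.count(target)
    return count / total_words, count / total_ngrams


# Hypothetical toy corpus for a single year.
texts = ["the cat sat on the mat", "the cat ran"]

# Bigram case: 9 words in total, but only 5 + 2 = 7 bigrams.
per_word, per_ngram = frequencies(texts, ("the", "cat"), 2)

# Unigram case: the two denominators coincide.
uni_per_word, uni_per_ngram = frequencies(texts, ("the",), 1)
```

Under these assumptions a text of k words contains k - n + 1 n-grams, so the two denominators agree for n = 1 and drift apart as n grows or as the texts get shorter.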