n-grams of different orders are scored on different scales

Hi there, just curious. This isn't really a bug...

Why are the scores of 2-grams and 3-grams not normalized relative to each other?

The 2-grams will typically have higher scores than the 3-grams.

This becomes problematic when trying to compare scores between 2-gram and 3-gram outputs.
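
To make concrete what I mean by "normalized", here is a minimal sketch. It assumes the score is a total log10 probability summed over the tokens, so longer sequences accumulate more negative terms. The function name `per_token_score` and the example numbers are made up for illustration; they are not KenLM output or part of its API.

```python
def per_token_score(total_log10_prob: float, num_tokens: int) -> float:
    """Return the average log10 probability per token.

    Dividing by the token count is one simple way to put sequences of
    different lengths on a comparable scale (an assumption, not
    necessarily what the library intends).
    """
    if num_tokens <= 0:
        raise ValueError("num_tokens must be positive")
    return total_log10_prob / num_tokens


# Hypothetical totals, chosen only to illustrate the effect.
bigram_total = -4.0   # made-up score of a 2-token sequence
trigram_total = -6.0  # made-up score of a 3-token sequence

# Raw totals favor the shorter sequence...
raw_favors_bigram = bigram_total > trigram_total

# ...while per-token averages can be compared directly.
bigram_avg = per_token_score(bigram_total, 2)
trigram_avg = per_token_score(trigram_total, 3)
```

With these made-up numbers the raw totals differ, but the per-token averages are equal, which is the kind of comparability I am after.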

Any insight would be great. With a detailed explanation, perhaps I can fix the issue and submit a pull request.

kpu / kenlm