Can KenLM or Kneser-Ney estimation be adapted from modelling words to modelling characters?
My application involves the distribution of characters (say, the digits of a telephone number or the characters of a street address) rather than words, and I would like to train a model on that distribution.
I have a probability vector for each character position from LeNet5, plus an incomplete corpus of valid phone numbers. I would like to model sequences of several characters rather than sequences of several words.
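To make the idea concrete, here is a minimal sketch of the preprocessing I have in mind. It assumes that a word-level n-gram toolkit treats any whitespace-separated token as a "word", so writing each character as its own token would turn it into a character-level model. The helper names `to_char_tokens` and `write_char_corpus` and the `<sp>` placeholder for a literal space are my own hypothetical choices, not part of KenLM.

```python
SPACE_TOKEN = "<sp>"  # hypothetical placeholder so real spaces survive tokenisation


def to_char_tokens(line: str) -> str:
    """Turn one sequence (e.g. a phone number) into space-separated character tokens."""
    chars = [SPACE_TOKEN if ch == " " else ch for ch in line.strip()]
    return " ".join(chars)


def write_char_corpus(lines, path):
    """Write one character-tokenised sequence per line, skipping blank lines."""
    with open(path, "w", encoding="utf-8") as out:
        for line in lines:
            if line.strip():
                out.write(to_char_tokens(line) + "\n")


if __name__ == "__main__":
    print(to_char_tokens("555-0123"))    # 5 5 5 - 0 1 2 3
    print(to_char_tokens("12 Main St"))  # 1 2 <sp> M a i n <sp> S t
```

If this is a sound approach, I assume the resulting file could be fed to KenLM's `lmplz` in the usual way (for example `lmplz -o 5 < chars.txt > chars.arpa`), but I am not sure whether the Kneser-Ney discount estimation behaves sensibly with such a small vocabulary.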