I'm curious if it would be possible to build a storage-efficient lm.diff file to patch an older lm.binary file into a newer one. I've experimented with some existing binary diff tools and have found the lm.diff file to be roughly the size of the new lm.binary after compression, but could a smarter tool be built for the kenlm model?
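In theory this is possible, but you'd be digging into the smoothing algorithm, because the discount parameters affect probabilities globally: retraining on new data can shift every stored probability, not just the entries for new n-grams. The quantizer is also free to move its centers, so quantized values can change throughout the file as well. Possible but annoying.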
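To make the "global" point concrete, here is a toy sketch in Python. It is not KenLM's code: it assumes a single absolute discount estimated as D = n1 / (n1 + 2*n2) from count-of-counts, whereas KenLM's modified Kneser-Ney uses several discounts per order and also stores backoff weights. The function names and the bigram counts are made up for illustration.

```python
from collections import Counter


def discount(counts):
    """Estimate one absolute discount D = n1 / (n1 + 2*n2) from count-of-counts."""
    count_of_counts = Counter(counts.values())
    n1, n2 = count_of_counts[1], count_of_counts[2]
    return n1 / (n1 + 2 * n2)


def discounted_probs(counts):
    """Discounted probability of each bigram given its context (no backoff term)."""
    d = discount(counts)
    context_totals = Counter()
    for (context, _word), c in counts.items():
        context_totals[context] += c
    return {bg: (c - d) / context_totals[bg[0]] for bg, c in counts.items()}


# Hypothetical bigram counts for the "old" model.
old_counts = {("a", "b"): 3, ("a", "c"): 1, ("b", "a"): 2, ("c", "a"): 1}

# The "new" data only adds bigrams in a context the old model never saw.
new_counts = dict(old_counts)
new_counts[("d", "e")] = 1
new_counts[("d", "f")] = 1

old_probs = discounted_probs(old_counts)
new_probs = discounted_probs(new_counts)

# Bigrams whose counts did not change but whose probability did.
changed = [bg for bg in old_counts if old_probs[bg] != new_probs[bg]]

print("old discount:", discount(old_counts))
print("new discount:", discount(new_counts))
print("old entries with a different probability:", len(changed), "of", len(old_counts))
```

Even though the new data never touches the old contexts, the two extra singletons change the count-of-counts, which changes the discount, which changes every old probability. A byte-level diff sees nearly all values as modified, which would explain a patch about the size of the new file. A model-aware diff would presumably have to ship the changed counts and re-derive the probabilities rather than diff the stored floats.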