Let's add an ANS implementation so we can really showcase the difference between BWD and other dictionary-symbolwise compressors like gzip.
The steps to this would be to create a system for chaining models and a way to output their predictions to the entropy coder. This is just to abstract from any entropy coder implementation, would it be Huffman, arithmetic coding, or rANS. And also to abstract from modeling techniques - n-order models, delayed modeling, context mixing, etc.
Let's add an ANS implementation so we can really showcase the difference between BWD and other dictionary-symbolwise compressors like gzip.
The steps to this would be to create a system for chaining models and a way to output their predictions to the entropy coder. This is just to abstract from any entropy coder implementation, would it be Huffman, arithmetic coding, or rANS. And also to abstract from modeling techniques - n-order models, delayed modeling, context mixing, etc.