Closed by vince62s 1 year ago
I understand the mechanics, but what is the reason behind this behavior?
Better compression
Sorry to insist, but if that is the only reason, why would merging the double line break in the middle of text not also count as "better compression"?
Hello,
How do we explain the fact that a double line break is encoded as a single ID (628) at the end of a sentence, but tokenized as two separate IDs (198 each) in the middle of text?
Thanks.
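For anyone landing here later, one plausible explanation is the pre-tokenization step rather than the BPE merges themselves. This is a minimal sketch, assuming a GPT-2 style byte-level BPE tokenizer, which the IDs quoted above appear to match. The pattern is a standard-library approximation of the GPT-2 split pattern, not the exact one, which relies on `\p{L}` and `\p{N}` from the third-party `regex` module. The name `pretokenize` is hypothetical.

```python
import re

# Approximation of the GPT-2 pre-tokenization pattern using only the stdlib.
# Assumption: [^\W\d_] stands in for \p{L} and \d for \p{N}.
PRETOKENIZE = re.compile(
    r"'s|'t|'re|'ve|'m|'ll|'d"
    r"| ?[^\W\d_]+"   # optional leading space + letters
    r"| ?\d+"         # optional leading space + digits
    r"| ?[^\s\w]+"    # optional leading space + punctuation
    r"|\s+(?!\S)"     # whitespace run that is NOT followed by a non-space char
    r"|\s+"           # whatever whitespace is left over
)


def pretokenize(text):
    """Split text into the chunks inside which BPE merges are allowed."""
    return PRETOKENIZE.findall(text)


# End of text: nothing follows the line breaks, so they stay in one chunk.
print(pretokenize("Hello.\n\n"))       # ['Hello', '.', '\n\n']

# Middle of text: the lookahead makes the first branch give back the last
# "\n", so the two line breaks end up in separate chunks.
print(pretokenize("Hello.\n\nWorld"))  # ['Hello', '.', '\n', '\n', 'World']
```

BPE merges are only applied inside a chunk, never across chunk boundaries. Under that assumption, a trailing `"\n\n"` chunk can be merged into a single token, while two separate `"\n"` chunks each map to the single-newline token, which would explain seeing one ID at the end of a sentence and two IDs in the middle of text.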