This also renames and reorganizes the UTF8Decoding module. It is now called String.DecodeUTF8 and hosts both forward and reverse UTF-8 decoding in a form that String or any other caller can use.
This commit also adds extensive code comments, with explanations and diagrams, covering how the DFA state machine works and why those particular state numbers were chosen. That level of internal documentation was needed as due diligence: we had to fully understand Hoehrmann's forward-decoding state machine before working out our own reverse-decoding state machine along similar lines. These organized notes should help others understand it too, since they explain things much more fully than Hoehrmann's own write-up (http://bjoern.hoehrmann.de/utf-8/decoder/dfa) does.
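For readers unfamiliar with reverse decoding, here is a minimal Python sketch of the idea: read the last code point of a byte string by walking backward from the end. It is an illustration only. It uses plain bit masking rather than a table-driven DFA, it is not the code in this commit, and the function name `decode_last_code_point` is hypothetical.

```python
def decode_last_code_point(data: bytes):
    """Return (code_point, byte_length) for the final UTF-8 sequence in
    `data`, or None if the trailing bytes are not a valid sequence.

    Hypothetical helper for illustration; not the module's actual API.
    """
    value = 0
    shift = 0
    # A UTF-8 sequence is at most 4 bytes, so look back at most 4 bytes.
    for count in range(1, min(4, len(data)) + 1):
        byte = data[-count]
        if byte & 0xC0 == 0x80:
            # Continuation byte (10xxxxxx): contributes 6 payload bits.
            value |= (byte & 0x3F) << shift
            shift += 6
            continue
        # Anything else must be the lead byte of the sequence.
        if byte < 0x80:
            expected, payload = 1, byte
        elif byte & 0xE0 == 0xC0:
            expected, payload = 2, byte & 0x1F
        elif byte & 0xF0 == 0xE0:
            expected, payload = 3, byte & 0x0F
        elif byte & 0xF8 == 0xF0:
            expected, payload = 4, byte & 0x07
        else:
            return None  # 0xF8..0xFF never appear in UTF-8
        if expected != count:
            return None  # lead byte disagrees with the bytes seen so far
        value |= payload << shift
        # Reject overlong encodings, surrogates, and out-of-range values.
        minimum = (0, 0x80, 0x800, 0x10000)[expected - 1]
        if value < minimum or value > 0x10FFFF or 0xD800 <= value <= 0xDFFF:
            return None
        return value, count
    # Empty input, or four continuation bytes with no lead byte.
    return None
```

For example, calling it on `"héllo€".encode("utf-8")` finds the three-byte sequence for U+20AC at the end. In a reverse decoder the lead byte arrives last, so the sequence length and validity checks can only be completed once it is reached, whereas a forward decoder knows the length from the first byte it reads.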