Most invocations of _convert are guaranteed not to handle digits in charset C. Handling digits in charset C adds complexity because the digits are grouped into pairs, which is handled by _buffer. (I've renamed _buffer to _digit_buffer for clarity.)
To isolate that complexity and ease type hinting, I restrict _convert to return only int and to raise an informative exception whenever the buffer would be triggered. Correspondingly, I introduce a function _convert_or_buffer returning int | None to handle the full case where _digit_buffer might be used.
There is only one place (_build) where _convert_or_buffer is necessary rather than _convert.
Reasons why _convert is safe to use in all other invocations:
In _new_charset it's used to switch charset, so it won't be used on a digit.
In _maybe_switch_charset and _build it's used to flush the buffer immediately after switching to either charset A or B, so the charset cannot be C.
In _convert_or_buffer I explicitly check before invoking that the charset isn't C.
This makes it clear why only the invocation in _build needs to handle the None return type. (IMO it's pretty tricky to deduce this if you're not already familiar with the implementation.)
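A minimal sketch of the split described above, not the actual implementation: the Encoder class, the reduced lookup tables, the TO_A/TO_B keys and the exception types are all hypothetical stand-ins chosen for illustration. Only the names _convert, _convert_or_buffer and _digit_buffer, and the int versus int | None return types, come from this change.

```python
from __future__ import annotations

# Hypothetical, heavily reduced lookup tables (assumption); the real tables are larger.
CHARSET_A = {"A": 33, "1": 17}
CHARSET_B = {"a": 65, "1": 17}
CHARSET_C_CODES = {"TO_B": 100, "TO_A": 101}  # non-digit codes available in charset C


class Encoder:
    """Hypothetical container for the two conversion helpers."""

    def __init__(self, charset: str = "B") -> None:
        self._charset = charset
        self._digit_buffer = ""

    def _convert(self, char: str) -> int:
        """Convert without buffering: always returns an int or raises."""
        if self._charset == "A":
            return CHARSET_A[char]
        if self._charset == "B":
            return CHARSET_B[char]
        if char in CHARSET_C_CODES:
            return CHARSET_C_CODES[char]
        # A digit in charset C would need the buffer, which _convert must not touch.
        raise RuntimeError(
            f"_convert cannot buffer {char!r} in charset C; use _convert_or_buffer"
        )

    def _convert_or_buffer(self, char: str) -> int | None:
        """Full case: returns None while waiting for the second digit of a pair."""
        if self._charset != "C":
            # Checked explicitly before invoking, so _convert cannot hit the buffer.
            return self._convert(char)
        if char in CHARSET_C_CODES:
            return CHARSET_C_CODES[char]
        if not char.isdigit():
            raise ValueError(f"{char!r} is not valid in charset C")
        self._digit_buffer += char
        if len(self._digit_buffer) < 2:
            return None  # first digit of the pair: nothing to emit yet
        value = int(self._digit_buffer)
        self._digit_buffer = ""
        return value


# Only a caller that may see digits in charset C has to handle None.
encoder = Encoder("C")
codes = [code for ch in "1234" if (code := encoder._convert_or_buffer(ch)) is not None]
```

With this shape, a type checker accepts every other call site treating the result of _convert as a plain int, and the single caller of _convert_or_buffer is the only place that needs the None check.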