Closed diegoiast closed 1 year ago
(ignoring the conflict)
Is this PR still valid? I fixed some crashes on my side.
Sorry for the late reply. The idea of the latest commits is to be encoding agnostic (by storing the string received over the socket and its encoding) and not to assume ASCII or UTF-8. Let me try your PR with the internal tests and see how it fits with the current state of the code. The topic is not trivial, especially when different platforms are considered.
Considering `char8_t` and `u8string`, and also not having more similar reports of failures, I will skip merging this PR.
rfc822 says that email should be ASCII/Latin1 - but in reality, I see cp1255 from gmail - and probably other 8-bit encodings, which are compatible with latin1... so this happens in the field. I am unsure if this is the best way to do this - I could not find a way to get a `uchar` from a `std::string`.

The C standard does not define how `isalpha()` behaves when we pass it a negative number (other than `EOF`). It deals with ASCII only. GLIBC tries to handle this by testing it against the current locale, which is... not something the standard demands. MSVC is more strict - it just throws.

So - all these functions need to have a `uch` value - an ugly but simple solution.

Some RTFM:
https://news.ycombinator.com/item?id=28703525
https://drewdevault.com/2020/09/25/A-story-of-two-libcs.html
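For anyone following along, here is a minimal sketch of the cast being discussed. The helper names `is_alpha_byte` and `count_alpha` are made up for illustration and are not taken from the PR or the library; the only real API used is `std::isalpha` from `<cctype>`. The point is that a `char` read from a `std::string` can be converted with `static_cast<unsigned char>` before it reaches the classification function, so bytes at or above 0x80 are never passed as negative values.

```cpp
#include <cctype>
#include <cstddef>
#include <string>

// Hypothetical helper (not from the PR): classify one byte taken from a std::string.
inline bool is_alpha_byte(char ch)
{
    // Convert to unsigned char first, so a byte >= 0x80 is passed to
    // isalpha() as a value in 0..255 instead of a negative int.
    return std::isalpha(static_cast<unsigned char>(ch)) != 0;
}

// Hypothetical helper: count alphabetic bytes without passing negative values to isalpha().
inline std::size_t count_alpha(const std::string& text)
{
    std::size_t count = 0;
    for (char ch : text)
    {
        if (is_alpha_byte(ch))
        {
            ++count;
        }
    }
    return count;
}
```

Whether a high byte such as 0xE0 counts as alphabetic still depends on the active locale; the cast only guarantees that the call itself is well defined.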