charybdis-ircd / charybdis

Scalable IRCv3.2 server for large, community-oriented networks
GNU General Public License v2.0

strip_unprintable: Don't strip all bytes with the MSB set #286

edk0 closed 4 years ago

aaronmdjones commented 4 years ago

Now the commit ID makes sense. :)

edk0 commented 4 years ago

For posterity, the argument for the first approach here:

<deadk> iirc, we're using that to check if values are smaller than 32 or something
<deadk> the line of reasoning is: values smaller than 128 must be non-negative signed chars, so the choice of negative number representation doesn't matter. apart from the sign bit, signed integer types are required to have bit-for-bit correspondence with their unsigned integer type, which means the only place the sign bit can go is MSB. due to said correspondence, if the value is in the non-negative
<deadk> signed char range, reinterpreting it as unsigned char can't change the value. if it's in the negative signed char range, one of three things could happen to the value, but it will always be at least 128 since it has MSB set
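
A minimal sketch of the idea in the log above, not the actual strip_unprintable code: the function name strip_control_bytes and its exact behaviour are assumptions made for illustration. The "three things" are presumably the three negative-number representations C permits (two's complement, ones' complement, sign-magnitude). Whichever one is in use, a byte read through an unsigned char pointer compares as at least 128 when its MSB is set, so a plain "smaller than 32" test only ever matches ASCII control characters.

```c
#include <stddef.h>

/*
 * Hypothetical helper (illustration only): remove bytes below 32 from a
 * NUL-terminated string in place and return the new length. Bytes with
 * the MSB set (e.g. parts of UTF-8 sequences) are kept.
 */
size_t
strip_control_bytes(char *s)
{
	/* View the same bytes as unsigned char so that a byte with the MSB
	 * set is seen as a value >= 128, even where plain char is signed. */
	const unsigned char *u = (const unsigned char *)s;
	size_t r, w = 0;

	for (r = 0; u[r] != '\0'; r++) {
		/* Copy via the original char pointer, so no value conversion
		 * between unsigned char and char is needed. */
		if (u[r] >= 32)
			s[w++] = s[r];
	}
	s[w] = '\0';
	return w;
}
```

Comparing through plain char instead would, on platforms where char is signed, treat every byte with the MSB set as negative and therefore "smaller than 32", which is the over-stripping the title of this PR refers to.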