GoogleCodeExporter closed this issue 8 years ago
I think I've fixed this in r508.
An empty string has no characters, therefore it can have no character encoding.
It follows that anywhere we do:
$encodedOutput = mb_convert_encoding("", SOME_CHARACTER_ENCODING);
we could simply do:
$encodedOutput = '';
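A minimal sketch of that shortcut, assuming the mbstring extension is loaded and the input is UTF-8. The helper name convertOrEmpty is hypothetical and is not part of the project's code:

```php
<?php
// Hypothetical helper: skip the conversion entirely for an empty string,
// since there are no characters to re-encode.
function convertOrEmpty(string $input, string $toEncoding): string
{
    if ($input === '') {
        return '';
    }
    // Assumption: the input is UTF-8; the real code may use another source encoding.
    return mb_convert_encoding($input, $toEncoding, 'UTF-8');
}

$encodedOutput = convertOrEmpty('', 'UTF-32');
```

Non-empty input still goes through mb_convert_encoding as before; only the empty case is short-circuited.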
The reason '' wasn't being handled as desired (i.e. ignoring the encoded portion
of a string or, more accurately, removing the encoded portion from the normalized
input string and adding nothing to the decoded string) is that Codec.decode
treated an empty string the same way as null.
In PHP, the loose comparison '' == null is true.
The fix is simply to use a strict comparison:
if ($decodedCharacter !== null)
and then we can return an empty string whenever we want to strip a character
from the string. Again, it's not necessary to apply character encoding to an
empty string.
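To illustrate the difference, here is a rough sketch of a decode loop under both comparisons. The functions decodeOne and decodeAll are hypothetical stand-ins, not the actual Codec.decode, and the policy of stripping an encoded NUL byte is assumed purely for the example:

```php
<?php
// Hypothetical single-character decoder: returns [decoded, bytesConsumed].
// It decodes %XX sequences and, as an assumed policy, strips %00 by
// returning '' for it. Anything that is not encoded yields null.
function decodeOne(string $input): array
{
    if (preg_match('/^%([0-9A-Fa-f]{2})/', $input, $m) === 1) {
        $byte = hexdec($m[1]);
        return [$byte === 0 ? '' : chr($byte), 3];
    }
    return [null, 0];
}

// Hypothetical decode loop. With $strict = false it uses the loose check
// (!= null), which also rejects ''; with $strict = true it uses !== null.
function decodeAll(string $input, bool $strict): string
{
    $decoded = '';
    $offset = 0;
    $length = strlen($input);
    while ($offset < $length) {
        [$decodedCharacter, $consumed] = decodeOne(substr($input, $offset));
        $matched = $strict
            ? ($decodedCharacter !== null)
            : ($decodedCharacter != null);
        if ($matched) {
            // Decoded (possibly to ''): drop the encoded portion from the input.
            $decoded .= $decodedCharacter;
            $offset += $consumed;
        } else {
            // Not decoded: copy one raw byte through unchanged.
            $decoded .= $input[$offset];
            $offset++;
        }
    }
    return $decoded;
}

$loose  = decodeAll('a%00b%41', false);
$strict = decodeAll('a%00b%41', true);
```

With the loose check, the '' returned for %00 is indistinguishable from "nothing decoded", so the encoded text is copied through; with the strict check it is removed from the output.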
As a side note: the only reason that returning a single space was a good
workaround for this issue is that it was returned as a single-byte character,
and the conversion from UTF-32 yielded an empty string.
Original comment by jahboite@gmail.com on 16 Feb 2010 at 4:12
Original issue reported on code.google.com by coreform on 2 Dec 2009 at 4:26