Open bkdotcom opened 5 months ago
These parens aren't useless... am I doing something wrong?
    class Utf8
    {
        // Get Unicode code point of utf-8-encoded character (without need for mbstring extension)
        public static function ord($char)
        {
            $ord = \ord($char[0]);
            if ($ord < 0x80) {
                return $ord;
            } elseif ($ord < 0xe0) {
                return ($ord - 0xc0 << 6)           // UselessParentheses (invalid)
                    + \ord($char[1]) - 0x80;
            } elseif ($ord < 0xf0) {
                return ($ord - 0xe0 << 12)          // UselessParentheses (invalid)
                    + (\ord($char[1]) - 0x80 << 6)  // UselessParentheses (invalid)
                    + \ord($char[2]) - 0x80;
            } elseif ($ord < 0xf8) {
                return ($ord - 0xf0 << 18)          // UselessParentheses (invalid)
                    + (\ord($char[1]) - 0x80 << 12) // UselessParentheses (invalid)
                    + (\ord($char[2]) - 0x80 << 6)  // UselessParentheses (invalid)
                    + \ord($char[3]) - 0x80;
            }
            return false;
        }
    }
unit tests:
    assertSame(97, Utf8::ord('a'));       // 1-byte
    assertSame(169, Utf8::ord('©'));      // 2-byte
    assertSame(65049, Utf8::ord('︙'));   // 3-byte
    assertSame(128169, Utf8::ord('💩'));  // 4-byte
to be clear: without parens, PHP evaluates

    $ord - 0xc0 << 6 + \ord($char[1]) - 0x80

as

    ($ord - 0xc0) << (6 + \ord($char[1]) - 0x80)
so... in

    ($ord - 0xc0 << 6) + \ord($char[1]) - 0x80

the parens are very much necessary. I also tried:

    ($ord - 0xc0 << 6) + (\ord($char[1]) - 0x80)

and

    $ord - 0xc0 << 6 + (\ord($char[1]) - 0x80)
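For anyone who wants to check the precedence claim quickly, here is a minimal sketch. The values are hypothetical stand-ins for illustration only: `$ord` and `$next` hold the two bytes of '©' (0xC2 0xA9), with `$next` standing in for `\ord($char[1])`.

```php
<?php
// Hypothetical values for illustration: the two bytes of '©' (U+00A9).
$ord  = 0xc2; // lead byte
$next = 0xa9; // continuation byte, stands in for \ord($char[1])

// With the parens: shift the lead-byte bits first, then add the low bits.
$withParens = ($ord - 0xc0 << 6) + $next - 0x80;

// Without the parens: + and - bind tighter than <<, so everything to the
// right of << becomes the shift count.
$withoutParens = $ord - 0xc0 << 6 + $next - 0x80;

// Fully parenthesised form of how PHP reads the line above.
$explicit = ($ord - 0xc0) << (6 + $next - 0x80);

var_dump($withParens);                    // int(169)
var_dump($withoutParens === $explicit);   // bool(true)
var_dump($withoutParens === $withParens); // bool(false)
```

Since removing the parens changes the result, they are not redundant here.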
Possibly similar to #1672, #1678