Closed: plastikfan closed this issue 3 years ago
I think this is impossible to fix, because the reported length of some emojis doesn't match their real length. For example:
The length of 🛡️ is 3. Why?
According to the links below, in Unicode v13.1 it is supposed to be length 2.
https://stackoverflow.com/questions/45624030/display-unicode-emoji-in-powershell
http://www.russellcottrell.com/greek/utilities/SurrogatePairCalculator.htm
https://docs.microsoft.com/en-us/dotnet/api/system.globalization.stringinfo?view=net-5.0
https://docs.google.com/document/d/1pC7N32TnmDr2xzFW4HscA1DyAPPZnwILUH2_03UL6Jo/preview
https://docs.microsoft.com/en-us/dotnet/standard/base-types/character-encoding-introduction#grapheme-clusters
https://docs.microsoft.com/en-us/dotnet/standard/base-types/character-encoding-introduction
https://peterscene.com/blog/grapheme-better-way-of-counting-characters/
https://manishearth.github.io/blog/2017/01/14/stop-ascribing-meaning-to-unicode-code-points/#grapheme-clusters
https://hsivonen.fi/string-length/
https://www.unicode.org/reports/tr29/
https://www.joelonsoftware.com/2003/10/08/the-absolute-minimum-every-software-developer-absolutely-positively-must-know-about-unicode-and-character-sets-no-excuses/
For the shield emoji (🛡️), which is defined as length 2 in Unicode spec 13.1 but whose .Length on Windows is 3, the following code returns 2:
$symbol = "🛡️";
$enumerator = $symbol.EnumerateRunes();
$count = 0;
while ($enumerator.MoveNext()) {
    $count++;
}
But we need a much more graceful and succinct way to get the count. This article suggests:
$symbol.EnumerateRunes().Count()
but running this results in:
RuntimeException: Method invocation failed because [System.Text.Rune] does not contain a method named 'Count'.
at <ScriptBlock>, C:\Users\Plastikfan\dev\github\PoSh\Hoopz\Elizium.Loopz\Tests\Public\commands\Show-Signals.tests.ps1:30
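The failure is most likely because PowerShell does not resolve LINQ extension methods with instance syntax, so it ends up looking for `Count` on each `Rune` instead of on the sequence. A minimal sketch of a shorter alternative, assuming PowerShell 7+ (where `String.EnumerateRunes` is available): wrap the call in `@()` so PowerShell enumerates the runes into an array, then read the array's `Count` property. The emoji is built from its code points here only to avoid depending on how this page encodes it.

```powershell
# Shield emoji: U+1F6E1 followed by the variation selector U+FE0F.
$symbol = [char]::ConvertFromUtf32(0x1F6E1) + [char]0xFE0F

# @() makes PowerShell enumerate the StringRuneEnumerator into an array of runes,
# so .Count is the array's element count rather than a method call on each Rune.
$runeCount = @($symbol.EnumerateRunes()).Count

Write-Host "UTF-16 length: $($symbol.Length), rune count: $runeCount"
```

This counts Unicode code points (runes), which is the same thing the `MoveNext` loop above counts.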
PS: We need it to return 2, because that is the number of display characters the shield actually takes up, not the 3 it is reported as having. The mismatch causes subtle bugs like the one being addressed here, but even this doesn't fix the issue.
Some experimental code:
It 'emojis' {
    # (new System.Globalization.StringInfo("e\u0301")).LengthInTextElements
    $symbol = "🛡️"; # U+1F6E1 followed by the variation selector U+FE0F
    # $length = (new System.Globalization.StringInfo($symbol)).LengthInTextElements;
    # [string] has no 'utf16' property, so format each UTF-16 code unit explicitly
    $utf16codepoints = ($symbol.ToCharArray() | ForEach-Object { 'U+{0:X4}' -f [int]$_ }) -join ' ';
    $stringInfo = [System.Globalization.StringInfo]::new($symbol);
    $length = $stringInfo.LengthInTextElements;
    # $bytes = [System.Text.Encoding]::Unicode.GetBytes($symbol);
    # $rune = [System.Text.Rune]::DecodeLastFromUtf16($symbol, )
    # ToString() on the enumerator only yields its type name, so format each rune instead
    $runes = ($symbol.EnumerateRunes() | ForEach-Object { 'U+{0:X4}' -f $_.Value }) -join ' ';
    # Count the runes (Unicode code points) by walking the enumerator
    $enumerator = $symbol.EnumerateRunes();
    $count = 0;
    while ($enumerator.MoveNext()) {
        $count++;
    }
    # $charCount = [System.Text.Encoding]::Default.GetCharCount($Symbol);
    # Write-Host "--> BYTES: '$bytes'"
    Write-Host "--> Length: $($symbol.Length)"
    Write-Host "--> LengthInTextElements: $($length)"
    Write-Host "--> Rune Count: $($count)"
    Write-Host "--> SYMBOL-INFO: '$($stringInfo.String)'"
    Write-Host "--> SYMBOL-INFO length: '$($stringInfo.String.Length)'"
    Write-Host "--> SYMBOL code points: '$($utf16codepoints)'"
    Write-Host "--> SYMBOL runes: '$($runes)'"
    Write-Host "--> FIRST: $($symbol[0])"
    Write-Host "--> SECOND: $($symbol[1])"
    Write-Host "--> THIRD: $($symbol[2])"
    # Write-Host "--> Char Count: '$charCount'"
}
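The experiment above mixes three different notions of "length", which may be why the numbers disagree. A small sketch, assuming PowerShell 7+, that puts them side by side for the shield emoji: UTF-16 code units (`.Length`), Unicode code points (runes), and text elements (`StringInfo.LengthInTextElements`). The variable names are illustrative only. None of the three is a display width; how many columns the emoji occupies is decided by the terminal.

```powershell
# Shield emoji: U+1F6E1 followed by the variation selector U+FE0F.
$symbol = [char]::ConvertFromUtf32(0x1F6E1) + [char]0xFE0F

# 1. UTF-16 code units: U+1F6E1 needs a surrogate pair (2), U+FE0F needs 1.
$codeUnits = $symbol.Length

# 2. Unicode code points (runes): U+1F6E1 and U+FE0F.
$runes = @($symbol.EnumerateRunes()).Count

# 3. Text elements as .NET segments them; the value can depend on the runtime version.
$textElements = [System.Globalization.StringInfo]::new($symbol).LengthInTextElements

# Show how many UTF-16 code units each rune contributes.
foreach ($rune in $symbol.EnumerateRunes()) {
    Write-Host ('U+{0:X4} uses {1} UTF-16 code unit(s)' -f $rune.Value, $rune.Utf16SequenceLength)
}

Write-Host "Code units: $codeUnits, runes: $runes, text elements: $textElements"
```

Seen this way, the reported length of 3 is the surrogate pair plus the variation selector, and the 2 returned by the rune loop is a code point count that happens to equal the two columns the shield occupies on screen.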
It 'question' {
    # Note: despite the name, this counts runes (Unicode code points), not grapheme clusters
    function get-GraphemeLength {
        [OutputType([int])]
        param(
            [Parameter(Position = 0)]
            [string]$Value
        )
        [System.Text.StringRuneEnumerator]$enumerator = $Value.EnumerateRunes();
        [int]$count = 0;
        while ($enumerator.MoveNext()) {
            $count++;
        }
        return $count;
    }
    function show-Grapheme {
        param(
            [Parameter(Position = 0)]
            [string]$Value
        )
        [int]$graphemeLength = get-GraphemeLength $Value;
        Write-Host "Symbol: '$Value', length: '$($Value.Length)', grapheme length: '$graphemeLength'";
    }
    # "--", "---", "---"
    # 012345678901234567890123456789
    [string[]]$symbols = @("🛡️", "🏷️", "🎯");
    Write-Host "Ruler: ->0123456789";
    foreach ($sym in $symbols) {
        show-Grapheme $sym;
    }
    # [string]$format = "Greetings: '{0}'";
    [int]$width = 10;
    foreach ($sym in $symbols) {
        # Write-Host $($format -f $sym);
        # Padding by .Length is what misaligns the length-3 emojis
        [int]$filler = $width - $sym.Length;
        Write-Host "$([string]::new('*', $filler))$sym"
    }
}
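Neither `.Length` nor the rune count is the display width for all three symbols: 🎯 is a single rune with `.Length` 2, while the other two carry a trailing variation selector. One possible workaround is to estimate the number of terminal cells directly. The sketch below is a rough heuristic, not a standard algorithm: `Get-DisplayWidthEstimate` is a hypothetical helper, and it assumes the terminal draws every supplementary-plane code point two cells wide and gives no width to variation selectors or the zero-width joiner. Requires PowerShell 7+.

```powershell
# Hypothetical helper: estimates how many terminal cells a string occupies.
function Get-DisplayWidthEstimate {
    [OutputType([int])]
    param(
        [Parameter(Position = 0)]
        [string]$Value
    )
    [int]$width = 0
    foreach ($rune in $Value.EnumerateRunes()) {
        $cp = $rune.Value
        # Variation selectors (U+FE00..U+FE0F) and the zero-width joiner take no cell.
        if (($cp -ge 0xFE00 -and $cp -le 0xFE0F) -or $cp -eq 0x200D) {
            continue
        }
        # Assumption: code points beyond the BMP (the emojis used here) are two cells wide.
        if ($cp -ge 0x10000) { $width += 2 } else { $width += 1 }
    }
    return $width
}

# The three symbols from the issue, built from code points to avoid encoding surprises.
$symbols = @(
    ([char]::ConvertFromUtf32(0x1F6E1) + [char]0xFE0F),  # shield
    ([char]::ConvertFromUtf32(0x1F3F7) + [char]0xFE0F),  # label
    ([char]::ConvertFromUtf32(0x1F3AF))                  # direct hit
)

[int]$width = 10
foreach ($sym in $symbols) {
    # Pad by the estimated cell width instead of the UTF-16 length.
    [int]$filler = $width - (Get-DisplayWidthEstimate $sym)
    Write-Host "$([string]::new('*', $filler))$sym"
}
```

With this estimate all three symbols get the same amount of filler. The heuristic will overcount emoji joined with a zero-width joiner into one glyph, and terminals do not all agree on emoji widths, so it should be checked against the actual console in use.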
this issue is a FUCKING nightmare
In the screenshot below:
... the signals 'PASTE-A' and 'PATTERN' are of length 3 and cause a rendering failure (see the gap).
Actually, there are other length-3 emojis in the signal list, but they occur on the dark lines, so the gap is hidden.