PowerShell / PSReadLine

A bash inspired readline implementation for PowerShell
BSD 2-Clause "Simplified" License

Some emoji, eg '😃' don't work, whereas others like '⏰' do #1329

Open mikemaccana opened 5 years ago

mikemaccana commented 5 years ago

Moved from https://github.com/microsoft/terminal/issues/1606

Using the app store preview build of Windows Terminal, with Win+. to send 😃 and then ⏰:

image

@DHowett-MSFT mentioned it's likely not a Windows Terminal issue:

Powershell has trouble with high unicode input ... and there's not much we can do to help that.


Name                           Value
----                           -----
PSVersion                      6.2.1
PSEdition                      Core
GitCommitId                    6.2.1
OS                             Microsoft Windows 10.0.18362
Platform                       Win32NT
PSCompatibleVersions           {1.0, 2.0, 3.0, 4.0…}
PSRemotingProtocolVersion      2.3
SerializationVersion           1.1.0.1
WSManStackVersion              3.0

vexx32 commented 5 years ago

I wonder if this is due to PS itself or PSReadLine...

Can you Remove-Module PSReadline and try it again?

mikemaccana commented 5 years ago

After Remove-Module PSReadline

image

vexx32 commented 5 years ago

Interesting. Input doesn't like it much, but output is OK.

I'd venture to say this is probably an issue both for PS itself in places and for the PSReadLine module as well.

@daxian-dbw any thoughts on this?

iSazonov commented 4 years ago

/cc @daxian-dbw @SteveL-MSFT Should we track the issue here or in the PSReadLine repo?

SteveL-MSFT commented 4 years ago

Seems like there is an issue in both PS and PSRL. We can keep this issue here for now. Removing PSRL and on macOS, simply pasting "πŸ˜ƒ results in οΏ½"πŸ˜ƒοΏ½. Quite strange.

SteveL-MSFT commented 4 years ago

@chuanjiao10 that may be true on Windows, but on macOS there is still an issue, although perhaps a different one from the original issue.

daxian-dbw commented 4 years ago

In Windows Terminal, garbage characters may appear when deleting the emoji character.

iSazonov commented 4 years ago

@daxian-dbw Are there any updates in PSRL on this issue?

daxian-dbw commented 4 years ago

@iSazonov Unfortunately, no news from the PSReadLine side. Also, as mentioned in this issue, the same problem repros without PSReadLine.

DHowett-MSFT commented 4 years ago

PSReadline uses raw input and non-PSReadline uses COOKED_READ in the console host. One of these things can never be fixed :smile:

iSazonov commented 4 years ago

I'd prefer to get updated COOKED_READ in Core.

DHowett-MSFT commented 4 years ago

Cooked read is provided by conhost.exe, not Core, and we cannot change it without significantly impacting application compatibility all across Windows. Raw input with a readline-like library is the correct thing to do.

iSazonov commented 4 years ago

Raw input with a readline-like library is the correct thing to do.

I mean that this must be in Core so that we do not re-implement this in every application. Is there a tracking issue?

lzybkr commented 4 years ago

@iSazonov - see https://github.com/PowerShell/PSReadLine/issues/1045

iSazonov commented 4 years ago

I opened https://github.com/dotnet/runtime/issues/800, maybe not entirely correct, but the presence of these features in Core seems desirable.

daxian-dbw commented 4 years ago

@DHowett-MSFT You mentioned in microsoft/terminal#1606 that:

Powershell has trouble with high unicode input ... and there's not much we can do to help that.

For all high unicode emoji input, such as 😀 (surrogates: D83D, DE00), Console.ReadKey() seems to return only the high surrogate character, like ConsoleKeyInfo { Key = 18, KeyChar = '\ud83d', Modifiers = 0 }, but not the low surrogate. Could you please provide me some pointers on how to correctly read all surrogates? And in general, can you please provide me some pointers on how a readline library should handle emojis?

SeeminglyScience commented 4 years ago

@daxian-dbw it's sent as two separate keys. If you do this you'll see both:

while ($true) {
    '0x{0:X4}' -f [int][Console]::ReadKey($true).KeyChar
}
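
For anyone following along, here is a minimal sketch of how the two halves could be stitched back together once both key events have been read. It is only an illustration, not PSReadLine's actual code: ConvertTo-CodePoint is a hypothetical helper, and it takes an array of chars instead of calling [Console]::ReadKey so that it can run non-interactively.

```powershell
# Hypothetical helper (not part of PSReadLine): merges UTF-16 surrogate halves,
# as delivered one per key event, into whole Unicode code points.
function ConvertTo-CodePoint {
    param([char[]] $KeyChars)

    $pendingHigh = $null
    foreach ($c in $KeyChars) {
        if ([char]::IsHighSurrogate($c)) {
            # First half of a pair: hold it until the low surrogate arrives.
            $pendingHigh = $c
        }
        elseif ($null -ne $pendingHigh -and [char]::IsLowSurrogate($c)) {
            # Second half: combine both halves into one code point.
            [char]::ConvertToUtf32($pendingHigh, $c)
            $pendingHigh = $null
        }
        else {
            # Ordinary character. An unpaired high surrogate held so far is dropped.
            [int] $c
            $pendingHigh = $null
        }
    }
}

# 0xD83D 0xDE00 is the surrogate pair for U+1F600, followed by a plain 'a'.
ConvertTo-CodePoint -KeyChars ([char]0xD83D, [char]0xDE00, [char]'a') |
    ForEach-Object { 'U+{0:X4}' -f $_ }
# U+1F600
# U+0061
```

A real input loop would also have to decide what to do with an unpaired surrogate; this sketch simply drops a dangling high surrogate.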

SeeminglyScience commented 4 years ago

Also, I think escape sequences are being written between the surrogates:

[Console]::WriteLine("`u{D83D}`u{DE00}")
# Shows emoji

[Console]::WriteLine("`u{D83D}`e[30m`u{DE00}")
# Shows two separate question mark characters

daxian-dbw commented 4 years ago

[Console]::WriteLine("`u{D83D}`u{DE00}") shows two question mark characters for me in both Windows Terminal and the legacy console host:

image

SeeminglyScience commented 4 years ago

Yeah you gotta change output encoding first:

[Console]::OutputEncoding = [Text.Encoding]::Unicode

iSazonov commented 4 years ago

Oh, is it time to change [Console]::OutputEncoding to Utf8 for PowerShell?

SeeminglyScience commented 4 years ago

Also Delete should delete two characters if the target is a high surrogate. @daxian-dbw let me know if you want a new issue for that.
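
A rough sketch of that Delete rule, assuming the edit buffer is a plain .NET string of UTF-16 code units. Get-DeleteLength is a hypothetical name used only for illustration and is not an existing PSReadLine function.

```powershell
# Hypothetical helper: how many UTF-16 code units a Delete at $Index should remove.
function Get-DeleteLength {
    param([string] $Buffer, [int] $Index)

    if ($Index -lt 0 -or $Index -ge $Buffer.Length) { return 0 }
    # A complete surrogate pair starting at $Index counts as one character.
    if ([char]::IsSurrogatePair($Buffer, $Index)) { return 2 }
    return 1
}

$buffer = 'a' + [char]::ConvertFromUtf32(0x1F600) + 'b'   # 4 UTF-16 code units
$index  = 1                                               # cursor on the high surrogate
$buffer.Remove($index, (Get-DeleteLength $buffer $index)) # ab
```

Backspace would need the mirror-image check (is the character before the cursor a low surrogate preceded by a high one), which is left out here.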

daxian-dbw commented 4 years ago

@SeeminglyScience Yes, please open a new issue for Delete. Thanks!

daxian-dbw commented 4 years ago

For me later: https://github.com/microsoft/terminal/issues/1503#issuecomment-605311026

rayphi commented 4 years ago

I also got an exception for a 💄 in my commit message.

Exception:

System.Text.EncoderFallbackException: Unable to translate Unicode character \\uD83D at index 10 to specified code page.
   at System.Text.EncoderExceptionFallbackBuffer.Fallback(Char charUnknown, Int32 index)
   at System.Text.EncoderFallbackBuffer.InternalFallback(ReadOnlySpan`1 chars, Int32& charsConsumed)
   at System.Text.Encoding.GetBytesWithFallback(ReadOnlySpan`1 chars, Int32 originalCharsLength, Span`1 bytes, Int32 originalBytesLength, EncoderNLS encoder)
   at System.Text.Encoding.GetBytesWithFallback(Char* pOriginalChars, Int32 originalCharCount, Byte* pOriginalBytes, Int32 originalByteCount, Int32 charsConsumedSoFar, Int32 bytesWrittenSoFar, EncoderNLS encoder)
   at System.Text.Encoding.GetBytes(Char* pChars, Int32 charCount, Byte* pBytes, Int32 byteCount, EncoderNLS encoder)
   at System.Text.EncoderNLS.GetBytes(Char[] chars, Int32 charIndex, Int32 charCount, Byte[] bytes, Int32 byteIndex, Boolean flush)
   at System.IO.StreamWriter.Flush(Boolean flushStream, Boolean flushEncoder)
   at System.IO.StreamWriter.Dispose(Boolean disposing)
   at System.IO.TextWriter.Dispose()
   at Microsoft.PowerShell.PSConsoleReadLine.<>c__DisplayClass81_0.<WriteHistoryRange>b__0()
   at Microsoft.PowerShell.PSConsoleReadLine.WithHistoryFileMutexDo(Int32 timeout, Action action)
   at Microsoft.PowerShell.PSConsoleReadLine.WriteHistoryRange(Int32 start, Int32 end, Func`2 fileOpener)
   at Microsoft.PowerShell.PSConsoleReadLine.IncrementalHistoryWrite()
   at Microsoft.PowerShell.PSConsoleReadLine.MaybeAddToHistory(String result, List`1 edits, Int32 undoEditIndex, Boolean fromDifferentSession, Boolean fromInitialRead)
   at Microsoft.PowerShell.PSConsoleReadLine.InputLoop()
   at Microsoft.PowerShell.PSConsoleReadLine.ReadLine(Runspace runspace, EngineIntrinsics engineIntrinsics, CancellationToken cancellationToken)
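
One possible reading of this trace, not confirmed anywhere in the thread, is that the history writer was handed the high surrogate \uD83D without its low partner, and an encoder configured to throw on invalid input rejected it. The sketch below reproduces only that general .NET behaviour. The strict UTF-8 encoder is an assumption standing in for whatever encoder the history file writer actually uses.

```powershell
# Assumption: a strict UTF-8 encoder stands in for the history writer's encoder.
$strict = [System.Text.UTF8Encoding]::new($false, $true)   # no BOM, throw on invalid data

$pair = [char]::ConvertFromUtf32(0x1F484)   # U+1F484 = surrogates D83D DC84
$strict.GetBytes($pair).Length              # 4 bytes for the complete pair

try {
    # Only the first half of the pair: not a valid character on its own.
    $null = $strict.GetBytes($pair.Substring(0, 1))
}
catch {
    # PowerShell wraps the .NET exception, so unwrap to the innermost one.
    $_.Exception.GetBaseException().GetType().Name   # EncoderFallbackException
}
```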

chenshuai2144 commented 4 years ago

I have the same error.

image

ninmonkey commented 3 years ago

This is caused by PSReadLine itself.

Error Case:

  1. Create a filename that requires surrogate pairs, e.g. one that includes a 🖥️ or 😀
  2. Copy the path to your clipboard
  3. Type gi ' and then paste the path

File not found

I verified gi (get-clipboard) does have the right path.

Working Case: Remove-Module

  1. Run Remove-Module PSReadLine
  2. Paste like the first test

Get-Item does find the file

image

Notes:

It doesn't seem to matter whether the 3 encodings are set to utf16le or utf8: [console]::InputEncoding, [console]::OutputEncoding, $OutputEncoding

Even though the terminal didn't render the runes correctly, the pasted text was still the valid path.

image

StevenBucher98 commented 1 year ago

Revisiting this issue, I don't seem to get any errors copying and pasting large sets of emojis in Windows Terminal with PowerShell 7.3.1 and PSReadLine 2.2.6.

image

Without Windows Terminal it seems to be okay, but the rendering is a little different and a few characters show up as ??

image

On Mac it gets a little messy with more than about 2 emojis; see the screen recording.

Uploading Screen Recording 2023-01-04 at 9.14.48 AM.mov…

UsmanTariq2 commented 7 months ago

Has this issue been fixed? Whenever i enter a emoji like πŸ€‘the windows terminal shows οΏ½ οΏ½ instead of rendering the proper emoji in the terminal when inputting an emoji The information provided here is confusing me that is why asking.