nodejs / node

Node.js JavaScript runtime ✨🐢🚀✨
https://nodejs.org

Invalid characters in stdout �� #52944

Closed GooseOb closed 2 weeks ago

GooseOb commented 2 weeks ago

Version

v21.7.3

Platform

WSL2, Manjaro Linux, also tried with git bash

Subsystem

No response

What steps will reproduce the bug?

Call process.stdout.write with a large string of UTF-8 characters.

How often does it reproduce? Is there a required condition?

No response

What is the expected behavior? Why is that the expected behavior?

Characters should not be replaced with ��.

What do you see instead?

��

Additional information

Cannot reproduce using bun.

Related: GooseOb/taraskevizer#5, fsouza/prettierd#694

RedYetiDev commented 2 weeks ago

Hi! Can you give some example code to reproduce this? My initial assumption is that the encoding is mismatched between UTF-8 and UTF-16 in your terminal.

GooseOb commented 2 weeks ago

I tried just reading the file and printing its content, and that works. But if I read from stdin, it produces �� (the first occurrence is on line 698). Am I doing something wrong?

latest_be_by.txt

script.mjs

#!/usr/bin/env node

let text = '';
if (!process.isTTY) for await (const chunk of process.stdin) text += chunk;

process.stdout.write(text);

./script.mjs < latest_be_by.txt > be_tarask_by_2.txt

RedYetiDev commented 2 weeks ago

What happens if you replace process.stdout.write(text); with process.stdout.write(text, 'utf-8');?

GooseOb commented 2 weeks ago

Nothing changed. I think the problem is with reading from stdin, not with writing.

console.log(/��/.test(text));
// true

RedYetiDev commented 2 weeks ago

It's possible that stdin is being read with an incorrect encoding. I'm not quite sure, so I'll label this as help wanted so someone else can take a look.
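One plausible mechanism, offered here as an assumption rather than something confirmed in the thread: without an encoding set, process.stdin yields Buffer chunks, and `text += chunk` decodes each chunk on its own. If a chunk boundary falls in the middle of a multi-byte UTF-8 character, each half decodes to U+FFFD (�). The sketch below simulates that with a hand-made split; the character and the chunk boundary are hypothetical.

```javascript
// Hypothetical demonstration, not taken from the thread: split one
// two-byte UTF-8 character across two chunks, as a stream might do.
const bytes = Buffer.from('ў', 'utf8'); // two bytes: 0xD1 0x9E
const chunks = [bytes.subarray(0, 1), bytes.subarray(1)];

// `naive += chunk` converts each Buffer to a string separately,
// so each half of the character is decoded alone and becomes U+FFFD.
let naive = '';
for (const chunk of chunks) naive += chunk;

// Joining the raw bytes first and decoding once keeps the character intact.
const joined = Buffer.concat(chunks).toString('utf8');

console.log(naive === '\uFFFD\uFFFD'); // naive result is two replacement characters
console.log(joined === 'ў');           // joined result is the original character
```

This would also explain why reading the whole file at once works while reading a large input from stdin does not: only the chunked path can cut a character in half.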

GooseOb commented 2 weeks ago

Thank you, I found the right way to read from stdin and it works now.
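The thread does not say which approach was used. As a minimal sketch, assuming the cause was multi-byte characters split across chunk boundaries, one option is to decode the chunks with a streaming TextDecoder. The helper name `readAllUtf8` is hypothetical and not part of Node.js.

```javascript
// Hypothetical helper: collect an async iterable of Buffers (such as
// process.stdin) into one string, decoding UTF-8 across chunk boundaries.
async function readAllUtf8(stream) {
  const decoder = new TextDecoder('utf-8');
  let text = '';
  for await (const chunk of stream) {
    // { stream: true } makes the decoder hold back an incomplete
    // multi-byte sequence until the next chunk arrives.
    text += decoder.decode(chunk, { stream: true });
  }
  // Flush anything still buffered at the end of the input.
  return text + decoder.decode();
}

// Usage sketch: echo stdin to stdout unchanged.
// readAllUtf8(process.stdin).then((text) => process.stdout.write(text));
```

A shorter alternative is to call process.stdin.setEncoding('utf8') before the `for await` loop, so the stream yields already-decoded strings instead of Buffers and the original `text += chunk` becomes safe.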