Unicode input is not preserved on Windows

yandexx commented 5 years ago

Hello!

When inputting cyrillic password, such as "привет", the library panics. I assume this happens for other types of non-latin input. I have only tested on Windows 10.

fn main() {
    // Prompt for a password on STDERR
    let pass = rpassword::prompt_password_stderr("Password: ").unwrap();
    println!("Your password is {}", pass);
}

Password: thread 'main' panicked at 'called `Result::unwrap()` on an `Err` value: Custom { kind: InvalidData, error: StringError("stream did not contain valid UTF-8") }', src\libcore\result.rs:1009:5
stack backtrace:
   0: std::sys::windows::backtrace::set_frames
             at /rustc/9fda7c2237db910e41d6a712e9a2139b352e558b\/src\libstd\sys\windows\backtrace\mod.rs:104
   1: std::sys::windows::backtrace::set_frames
             at /rustc/9fda7c2237db910e41d6a712e9a2139b352e558b\/src\libstd\sys\windows\backtrace\mod.rs:104
   2: std::sys::windows::backtrace::set_frames
             at /rustc/9fda7c2237db910e41d6a712e9a2139b352e558b\/src\libstd\sys\windows\backtrace\mod.rs:104
   3: std::sys_common::backtrace::print
             at /rustc/9fda7c2237db910e41d6a712e9a2139b352e558b\/src\libstd\sys_common\backtrace.rs:59
   4: std::sys_common::backtrace::print
             at /rustc/9fda7c2237db910e41d6a712e9a2139b352e558b\/src\libstd\sys_common\backtrace.rs:59
   5: std::panicking::default_hook
             at /rustc/9fda7c2237db910e41d6a712e9a2139b352e558b\/src\libstd\panicking.rs:227
   6: std::panicking::rust_panic_with_hook
             at /rustc/9fda7c2237db910e41d6a712e9a2139b352e558b\/src\libstd\panicking.rs:491
   7: std::panicking::continue_panic_fmt
             at /rustc/9fda7c2237db910e41d6a712e9a2139b352e558b\/src\libstd\panicking.rs:398
   8: std::panicking::rust_begin_panic
             at /rustc/9fda7c2237db910e41d6a712e9a2139b352e558b\/src\libstd\panicking.rs:325
   9: core::panicking::panic_fmt
             at /rustc/9fda7c2237db910e41d6a712e9a2139b352e558b\/src\libcore\panicking.rs:95
  10: core::result::unwrap_failed<std::io::error::Error>
             at /rustc/9fda7c2237db910e41d6a712e9a2139b352e558b\src\libcore\macros.rs:26
  11: core::result::Result<alloc::string::String, std::io::error::Error>::unwrap<alloc::string::String,std::io::error::Error>
             at /rustc/9fda7c2237db910e41d6a712e9a2139b352e558b\src\libcore\result.rs:808
  12: test_rpassword::main
             at .\src\main.rs:3

I also tried inputting Czech characters. Accented "ščř" get changed to simple "scr". Probably this is related.

Thanks!

conradkleinespel commented 5 years ago

Hey @yandexx and thanks for the bug report ! Feel free to open a PR if you have a working fix and corresponding test case :wink: also, would you mind sharing what version of Rust you're using ?

@equalsraf, would you happen to have run into this bug before ?

yandexx commented 5 years ago

That's on current stable 1.32 :) I don't have a PR but I might as well try and fix it.

conradkleinespel commented 5 years ago

@yandexx Hello ! Are you still interested in fixing this ? If not, I'll be closing this issue as unfortunately, I do not have time to invest in fixing this.

yandexx commented 5 years ago

Hello @conradkdotcom. I was busy with life, I'll try to have a look into this.

But in any case the issue cannot be closed since it's a bug. Someone else may try to fix it too.

conradkleinespel commented 5 years ago

@yandexx The panic "stream did not contain valid UTF-8" comes from another library, so if this is a bug, it seems like it is a bug in another library. Maybe this could have to do with the way you've configured your terminal as well. Here's what I get when I run env | grep LC_ in my terminal:

Do you see a similar output as well ?

yandexx commented 5 years ago

I've only run this on Windows, so the above isn't applicable.

yandexx commented 5 years ago

OK, interestingly, in rpassword 3.0.0 this no longer panics, but unicode is not preserved.

"привет" results in "??????"
"ěščřž" results in "escrz"

yandexx commented 5 years ago

What I noticed is rpassword obtains stdin from a raw Windows file handle.

I have then tried a minimal example where I use stdin from std, then from a handle, and the problem is the same there. I don't know whose fault it is.

use std::io;
use std::io::prelude::*;
use std::os::windows::io::{AsRawHandle, FromRawHandle};

fn main() {
    let mut input = String::new();

    let stdin = io::stdin();
    match stdin.read_line(&mut input) {
        Ok(n) => println!("Input is \"{}\", {} bytes read", input, n),
        Err(error) => println!("error: {}", error),
    }

    let mut stdin = io::BufReader::new(unsafe {
        ::std::fs::File::from_raw_handle(io::stdin().as_raw_handle())
    });
    match stdin.read_line(&mut input) {
        Ok(n) => println!("Input is \"{}\", {} bytes read", input, n),
        Err(error) => println!("error: {}", error),
    }
}

qwerйцукěščř
Input is "qwerйцукěščř
", 22 bytes read
qwerйцукěščř
Input is "qwerйцукěščř
qwer????escr
", 14 bytes read

conradkleinespel commented 5 years ago

Thanks for looking into this @yandexx ! I think Windows uses UTF-16 by default, although I'm nore 100% sure: I haven't done any Windows development. What surprises me most is that your output seems to say that with the RAW file handle, the reader stops at 14 bytes. Not sure why that is.

yandexx commented 5 years ago

Yes it's as if the encoding is lost somewhere on the way. We could maybe workaround that by acquiring the io::stdin() right after changing the console parameters. I'll give it a test.

yandexx commented 5 years ago

So with this fix below (against the current master) the input works as I would expect it, with prompt_password_stdout, prompt_password_stderr and read_password_from_tty.

All Unicode input is preserved. I've tested on Windows 10 x64 1809 (“modern console”) and 2012 R2 (legacy console).

Does this look good? Should I open a PR, or do we need any additional testing?

--- a/src/lib.rs
+++ b/src/lib.rs
@@ -164,8 +164,8 @@ mod unix {
 mod windows {
     extern crate winapi;
     extern crate kernel32;
-    use std::io::{self, BufRead, Write};
-    use std::os::windows::io::{FromRawHandle, IntoRawHandle};
+    use std::io::{self, Write};
+    use std::os::windows::io::{FromRawHandle, AsRawHandle};
     use self::winapi::winnt::{
         GENERIC_READ, GENERIC_WRITE, FILE_SHARE_READ, FILE_SHARE_WRITE,
     };
@@ -206,11 +206,9 @@ mod windows {
         }

         // Read the password.
-        let mut source = io::BufReader::new(unsafe {
-            ::std::fs::File::from_raw_handle(handle)
-        });
+        let source = io::stdin();
         let input = source.read_line(&mut password);
-        let handle = source.into_inner().into_raw_handle();
+        let handle = source.as_raw_handle();

         // Check the response.
         match input {

conradkleinespel commented 5 years ago

@yandexx Looks good to me, you can open a PR, yes, that'd be great ! I'll go ahead and merge straightaway, it looks like you tested it well, I trust that it works :+1:

conradkleinespel / rpassword

Unicode input is not preserved on Windows #31