0nkery / how2-rs

Simple CLI tool to retrieve answers from StackExchange.
MIT License

[Bug] I have not query output in Windows #4

Open Kristinita opened 7 years ago

Kristinita commented 7 years ago

Summary

I cannot run any queries.

Expected behavior

My queries return output successfully.

Actual behavior

This happens with every query.

E:\how2-rs\build\windows_i686>how2-rs PHP
thread 'main' panicked at 'called `Option::unwrap()` on a `None` value', ../src/libcore\option.rs:326
note: Run with `RUST_BACKTRACE=1` for a backtrace.

E:\how2-rs\build\windows_i686>how2-rs PHP RUST_BACKTRACE=1
The RUST_BACKTRACE environment variable will give you a backtrace. Try

$ RUST_BACKTRACE=1 ./target/ecmiwc -i www.google.com -u testuser -v test

===-----===
There is a pull request merged into main rust repo which adds file names and line numbers to backtrace. As far as I can see it was a part of rust 1.0.0 stable release.

You have to enable backtraces and build executable using cargo profile which includes debug symbols into executable (with debug = true option in cargo manifest). AFAIK cargo run is using debug profile by default now.

Here is example trace output with file names and line numbers:

[user@salikhov ~/workspace/mqtt-rust $ RUST_BACKTRACE=1 cargo run
   Compiling mqtt v0.1.0 (file:///home/user/workspace/mqtt-rust)

     Running `target/debug/mqtt`
thread '<main>' panicked at 'I want line numbers!', src/proto/client.rs:33
stack backtrace:
   1:     0x7ff049fa47d9 - sys::backtrace::tracing::imp::write::he18882fa84e6b00ePnt
   2:     0x7ff049fa39b8 - panicking::on_panic::h495226a97f084523enx
   3:     0x7ff049f9dcce - sys_common::unwind::begin_unwind_inner::h7a4ee06c0d57e26affs
   4:     0x7ff049f95f47 - sys_common::unwind::begin_unwind::h13029855766851973181
                        at ../src/libstd/sys/common/unwind/mod.rs:232
   5:     0x7ff049f95e8a - proto::client::MqttConnection::connect::h633d3d42c15a3dedgYa
                        at /home/user/workspace/mqtt-rust/<std macros>:3
   6:     0x7ff049f80416 - main::h1d77c75265710f92gaa
                        at src/main.rs:5
   7:     0x7ff049fa6084 - sys_common::unwind::try::try_fn::h4848098439110500489
   8:     0x7ff049fa3098 - __rust_try
   9:     0x7ff049fa5cf8 - rt::lang_start::hcf64c98c1a7c0031Zkx
  10:     0x7ff049f834f6 - main
  11:     0x7ff049170ec4 - __libc_start_main
  12:     0x7ff049f802a8 - <unknown>
  13:                0x0 - <unknown>
An unknown error occurred

Unfortunately, this is broken on some platforms like MacOS X. There is open issue about this in rust github issue tracker.

===-----===
If you actually look at the headers that Google returns:

HTTP/1.1 200 OK
Date: Fri, 22 Jul 2016 20:45:54 GMT
Expires: -1
Cache-Control: private, max-age=0
Content-Type: text/html; charset=ISO-8859-1
P3P: CP="This is not a P3P policy! See https://www.google.com/support/accounts/answer/151657?hl=en for more info."
Server: gws
X-XSS-Protection: 1; mode=block
X-Frame-Options: SAMEORIGIN
Set-Cookie: NID=82=YwAD4Rj09u6gUA8OtQH73BUz6UlNdeRc9Z_iGjyaDqFdRGMdslypu1zsSDWQ4xRJFyEn9-UtR7U6G7HKehoyxvy9HItnDlg8iLsxzlhNcg01luW3_-HWs3l9S3dmHIVh; expires=Sat, 21-Jan-2017 20:45:54 GMT; path=/; domain=.google.ca; HttpOnly
Alternate-Protocol: 443:quic
Alt-Svc: quic=":443"; ma=2592000; v="36,35,34,33,32,31,30,29,28,27,26,25"
Accept-Ranges: none
Vary: Accept-Encoding
Transfer-Encoding: chunked

You can see

  Content-Type: text/html; charset=ISO-8859-1

Additionally

  Therefore, the error must be caused by me incorrectly converting the byte sequence into an UTF-8 string.

There is no conversion to UTF-8 happening. read_to_string simply ensures that the data is UTF-8.

Simply put, assuming that an arbitrary HTML page is encoded in UTF-8 is completely incorrect. At best, you have to parse the headers to find the encoding and then convert the data. This is complicated because there's no real definition for what encoding the headers are in.

Once you have found the correct encoding, you can use a crate such as encoding to properly transform the result into UTF-8, if the result is even text! Remember that HTTP can return binary files such as images.
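As a rough illustration of the point above, here is a minimal sketch that reads the charset from a Content-Type value and decodes an ISO-8859-1 body by hand. It does not use the encoding crate; it relies only on the fact that each ISO-8859-1 byte maps to the Unicode code point with the same value. The helper names `charset_of` and `decode_latin1` are hypothetical, the sample bytes are made up, and a real client would need to handle many more charsets and quoted or differently cased parameters.

```rust
/// Decode ISO-8859-1 (Latin-1) bytes into a UTF-8 `String`.
/// Each Latin-1 byte equals the Unicode code point of the same value,
/// so a byte-to-char cast is a complete decoder for this one charset only.
fn decode_latin1(bytes: &[u8]) -> String {
    bytes.iter().map(|&b| b as char).collect()
}

/// Hypothetical helper: extract the charset label from a Content-Type value.
/// Only matches a lowercase, unquoted `charset=` parameter.
fn charset_of(content_type: &str) -> Option<&str> {
    content_type
        .split(';')
        .map(|part| part.trim())
        .find_map(|part| part.strip_prefix("charset="))
}

fn main() {
    let content_type = "text/html; charset=ISO-8859-1";
    // "café" in Latin-1: the final byte 0xE9 is not valid UTF-8 on its own.
    let body = [0x63, 0x61, 0x66, 0xE9];

    // Interpreting these bytes strictly as UTF-8 fails.
    assert!(String::from_utf8(body.to_vec()).is_err());

    match charset_of(content_type) {
        Some(cs) if cs.eq_ignore_ascii_case("ISO-8859-1") => {
            println!("{}", decode_latin1(&body));
        }
        other => println!("unhandled charset: {:?}", other),
    }
}
```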

===-----===
We can confirm with the iconv command that the data returned from http://www.google.com is not valid UTF-8:

$ wget http://google.com -O page.html
$ iconv -f utf-8 page.html > /dev/null
iconv: illegal input sequence at position 5591

For some other urls (like http://www.reddit.com) the code works fine.

If we assume that the most part of the data is valid UTF-8, we can use String::from_utf8_lossy to workaround the problem:

pub fn print_html(url: &str) {
    let client = Client::new();
    let req = client.get(url).send();

    match req {
        Ok(mut res) => {
            println!("{}", res.status);

            let mut body = Vec::new();

            match res.read_to_end(&mut body) {
                Ok(_) => println!("{:?}", String::from_utf8_lossy(&*body)),
                Err(why) => panic!("String conversion failure: {:?}", why),
            }
        }
        Err(why) => panic!("{:?}", why),
    }
}

Note that Read::read_to_string and Read::read_to_end return Ok with the number of bytes read on success, not the data itself.
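The same behavior can be checked without any network access. The sketch below uses only the standard library and a byte slice as a stand-in for a response body; the helper name `read_lossy` and the sample bytes are assumptions made for illustration.

```rust
use std::io::Read;

/// Read everything from `source` and decode it leniently as UTF-8.
/// Returns the byte count reported by `read_to_end` together with the text.
fn read_lossy<R: Read>(mut source: R) -> std::io::Result<(usize, String)> {
    let mut body = Vec::new();
    // The Ok value is the number of bytes read; the data itself lands in `body`.
    let n = source.read_to_end(&mut body)?;
    // Invalid UTF-8 sequences are replaced with U+FFFD instead of failing.
    Ok((n, String::from_utf8_lossy(&body).into_owned()))
}

fn main() -> std::io::Result<()> {
    // Stand-in for a response body: "caf" followed by a lone 0xE9 byte.
    let raw: &[u8] = &[0x63, 0x61, 0x66, 0xE9];
    let (n, text) = read_lossy(raw)?;
    println!("{} bytes: {}", n, text);
    Ok(())
}
```

The lossy conversion keeps the program running, but the replaced characters are gone for good, so it only suits cases where an approximate rendering is acceptable.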

===-----===
I asked about this in #rust-internals, and sfackler said

  I believe it has no effect except during a panic

===-----===

E:\how2-rs\build\windows_i686>

Steps to reproduce

Cloned the how2-rs repository → started cmd.exe in the folder containing how2rs.exe → got the error.

Software and hardware

Operating system: Windows 32-bit 10.0.14393

Thanks.

Kristinita commented 7 years ago

The problem reproduces on:

Thanks.

0nkery commented 7 years ago

Your second query how2-rs PHP RUST_BACKTRACE=1 emitted some results.

0nkery commented 7 years ago

I have made the errors more readable and understandable. Try rebuilding the project (after updating it first).

PS. On Linux, by the way, these errors do not occur.
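For readers who hit the same `Option::unwrap()` panic, here is a minimal sketch of what a more readable error could look like. This is not the actual how2-rs code: the thread does not show where the `unwrap()` sits, and `QueryError` and `first_answer` are hypothetical names used only for illustration.

```rust
use std::fmt;

/// Hypothetical error type; how2-rs itself may model failures differently.
#[derive(Debug, PartialEq)]
enum QueryError {
    NoResults(String),
}

impl fmt::Display for QueryError {
    fn fmt(&self, f: &mut fmt::Formatter) -> fmt::Result {
        match self {
            QueryError::NoResults(q) => write!(f, "no answers found for query '{}'", q),
        }
    }
}

/// Hypothetical lookup: return the first answer, or explain why there is none,
/// instead of calling `unwrap()` on an `Option` that may be `None`.
fn first_answer<'a>(query: &str, answers: &'a [String]) -> Result<&'a str, QueryError> {
    answers
        .first()
        .map(|s| s.as_str())
        .ok_or_else(|| QueryError::NoResults(query.to_string()))
}

fn main() {
    // An empty result set stands in for whatever made the lookup return `None`.
    let answers: Vec<String> = Vec::new();
    match first_answer("PHP", &answers) {
        Ok(text) => println!("{}", text),
        Err(err) => eprintln!("error: {}", err),
    }
}
```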