seanmonstar / reqwest

An easy and powerful Rust HTTP Client
https://docs.rs/reqwest
Apache License 2.0
9.72k stars 1.09k forks source link

wasm: `Unhandled Promise Rejection: Error: url parse` on cross-origin `no-cors` response #1401

Open lilyball opened 2 years ago

lilyball commented 2 years ago

If I issue a request to another origin that does not understand CORS, and I set fetch_mode_no_cors(), I end up with the following in my console:

[Error] Unhandled Promise Rejection: Error: url parse
    (anonymous function) (index-4817df6319a5b69d.js:635)
    wasm-stub
    <?>.wasm-function[wasm_bindgen::throw_str::h0bf8053b15d021f8]
    <?>.wasm-function[<core::result::Result<T,E> as wasm_bindgen::UnwrapThrowExt<T>>::expect_throw::hd6589de1dc146b56]
    <?>.wasm-function[reqwest::wasm::client::fetch::{{closure}}::h5058052b8d798b38]
    <?>.wasm-function[<core::future::from_generator::GenFuture<T> as core::future::future::Future>::poll::h10dd69cb7e5812c8]
    <?>.wasm-function[reqwest::wasm::request::RequestBuilder::send::{{closure}}::h1710f36b1672ff93]
    <?>.wasm-function[<core::future::from_generator::GenFuture<T> as core::future::future::Future>::poll::h8c4e0bd14ab13c16]
    <?>.wasm-function[redacted]
    <?>.wasm-function[<core::future::from_generator::GenFuture<T> as core::future::future::Future>::poll::hdf4e833184ee9556]
    <?>.wasm-function[redacted]
    <?>.wasm-function[<core::future::from_generator::GenFuture<T> as core::future::future::Future>::poll::hfaf164f5518cc6d5]
    <?>.wasm-function[<yew::html::component::scope::Scope<COMP> as yewtil::future::LinkFuture>::send_future::{{closure}}::hdf65ac3822f47f72]
    <?>.wasm-function[<core::future::from_generator::GenFuture<T> as core::future::future::Future>::poll::h290f2a0b8766693d]
    <?>.wasm-function[wasm_bindgen_futures::task::singlethread::Task::run::hfc6dc0a03e6a4a17]
    <?>.wasm-function[wasm_bindgen_futures::queue::QueueState::run_all::h28aa1ce5d7b80fb8]
    <?>.wasm-function[wasm_bindgen_futures::queue::Queue::new::{{closure}}::hdc226008ddda3af8]
    <?>.wasm-function[<dyn core::ops::function::FnMut<(A,)>+Output = R as wasm_bindgen::closure::WasmClosure>::describe::invoke::he3e689315c96fc66]
    wasm-stub
    2603
    __wbg_adapter_27 (index-4817df6319a5b69d.js:222:135)
    real (index-4817df6319a5b69d.js:191)
    promiseReactionJob

It looks like the issue is the line

let url = Url::parse(&js_resp.url()).expect_throw("url parse");

I can see in my browser console that the request was fine. Attempting to reproduce this request in JavaScript using the fetch() API shows that I get an opaque response whose url is an empty string.

Failure to parse the URL should not throw an unhandled exception. It's hard to debug (and it leaks the stack), and this is an error that will always happen for opaque filtered responses.

Additional Context

I am not a web dev. I know very little about CORS. The fact that I get back an opaque filtered request is something I learned while trying to debug this.

target: wasm32-unknown-unknown

reqwest v0.11.7
├── reqwest feature "__tls"
├── reqwest feature "default"
├── reqwest feature "default-tls"
├── reqwest feature "hyper-tls"
├── reqwest feature "json"
├── reqwest feature "native-tls-crate"
├── reqwest feature "serde_json"
└── reqwest feature "tokio-native-tls"
seanmonstar commented 2 years ago

Failure to parse the URL should not throw an unhandled exception. It's hard to debug (and it leaks the stack), and this is an error that will always happen for opaque filtered responses.

I'm not familiar enough with the fetch API to know much about what you said. I'm happy to support making the situation better. Do you have suggestions?

lilyball commented 2 years ago

@seanmonstar Since I'm not a web dev, I haven't run into this before. Here's a quick reproduction from JS to show the response:

  1. Open a new window to about:blank
  2. Open the web inspector
  3. Run the following:
    fetch("http://www.example.com", { mode: 'no-cors' }).then(resp => console.log(resp))

This will log a value that looks like

Response {
    body: null,
    bodyUsed: false,
    headers: Headers {append: function, delete: function, get: function, has: function, set: function, …},
    ok: false,
    redirected: false,
    status: 0,
    statusText: "",
    type: "opaque",
    url: "",
}

Looking up the details of type: "opaque", this seems to be an opaque filtered response. Digging into the fetch steps, it looks like a no-cors request to a different origin sets the response tainting to "opaque" (§4.1 Main fetch, item 11), and then subsequently the "opaque" response tainting causes the response to be returned as an opaque filtered response (§4.1 Main fetch, item 13).


Ultimately, the reason why this happens doesn't matter, just the fact that it's a legitimate state does. The quick fix is to change the .expect_throw("url parse") into a reqwest::Error instead that gets returned. This should probably be done by checking the "type" field of course, rather than assuming this is due to a url parse failure. Or it could be done by checking if the url is an empty string, but checking for the opaque type seems more correct.

The downside of course is that a no-cors request to a different origin now always returns an error. Granted, we don't actually know what the status of the request is, but ideally we'd have a structured representation of this fact. I don't see how we can get that though without making no-cors requests distinct at the type level. This could be done by adding typestate to RequestBuilder and Request, but that's honestly not great (beyond being more complicated, it also means the types will differ in WASM rather than just the available methods). So maybe the best thing to do really is to just keep it as an error, add an Error::is_opaque_response() method (perhaps with a cors-related #[doc(alias = …)]), and document this on fetch_mode_no_cors().

duckfromdiscord commented 1 year ago

having this issue on the reqwest_wasm version here https://github.com/samdenty/reqwest . any workaround?

nomyTx commented 1 year ago

any updates on this? i use reqwest in my library and this issue makes it not work in wasm

duckfromdiscord commented 1 year ago

now using this version of reqwest and not reqwest_wasm and still having the same issue.

duckfromdiscord commented 1 year ago

So maybe the best thing to do really is to just keep it as an error, add an Error::is_opaque_response() method (perhaps with a cors-related #[doc(alias = …)]), and document this on fetch_mode_no_cors().

@lilyball Won't this mean that no-cors requests still throw an error? The goal should be skipping url parsing or whatever is causing the error and still returning whatever the response was. Do you have a suggestion for how to do that ?

lilyball commented 1 year ago

@duckfromdiscord The response is fully opaque, so the only useful information to be gained here is "this response is opaque". I do agree that doing this as part of the error sucks, but putting it into Response means making the url optional (or allowing an empty string to parse as a url, which sucks for a different reason). This is why I raised the possibility of typestate in my previous comment, but that adds additional complexity.

Making it as a separate error with a simple test function, and fully documenting this (especially on fetch_mode_no_cors()), is the simplest approach that makes the library usable, even if it's not all that ergonomic.

Also FWIW, I've moved on from the project that motivated filing this issue and so I don't have anything invested in it anymore. I would like to see it fixed but the manner in which it's fixed won't affect me.

sxlijin commented 2 months ago

+1 we're also running into this issue

Repro in https://github.com/seanmonstar/reqwest/issues/1401#issuecomment-987524762 seems pretty straightforward to me.

Primary question for me is how to expose this in the reqwest API - I don't know what a good way to model this is.

victoryforphil commented 2 weeks ago

Bump, any work around for this found? Blocking any use of this in a WASM context.

seanmonstar commented 2 weeks ago

My information and knowledge is the same. I welcome a fix to the issue. I don't use WASM myself, so it would require someone else to investigate and find a flexible fix.

jesseditson commented 2 weeks ago

I think there may be some confusion here around what should and shouldn't work in WASM - assuming you're using something like wasm-bindgen, your rust lib is compiled to a WASM binary that will only ever run in a single threaded environment. Reqwest is built to abstract http methods in a normal multi-threaded environment and so abstracting back to fetch probably isn't the cleanest path to doing an http request.

I hate to x-y the problem, but I suspect that a good amount of folks finding this issue would be better served by importing web_sys crate features and implementing the fetch directly, something along the lines of:


use super::HTTPMethod;
use std::{collections::HashMap, future::Future, pin::Pin};
use tracing::{debug, error};
use urlencoding::encode;
use wasm_bindgen::prelude::*;
use wasm_bindgen::JsValue;
use wasm_bindgen_futures::JsFuture;
use web_sys::{Request, RequestInit, RequestMode, Response};

fn fetch_impl(
        method: HTTPMethod,
        path: String,
        params: HashMap<String, String>,
        headers: HashMap<String, String>,
        body: Option<String>,
        try_num: usize,
    ) -> Pin<Box<dyn Future<Output = Result<Response, JsValue>>>> {
        let mut encoded_params = vec![];
        for (key, val) in &params {
            encoded_params.push(format!("{}={}", encode(key), encode(val)));
        }
        let url = format!("{}?{}", path, encoded_params.join("&"));
        let mut opts = RequestInit::new();
        opts.method(method.to_str());
        opts.mode(RequestMode::Cors);
            let global: Global = js_sys::global().unchecked_into();
        let worker = global.unchecked_into();
        Box::pin(async move {
            match &body {
                None => {}
                Some(body) => {
                    opts.body(Some(&JsValue::from_str(body)));
                }
            }
            let request = Request::new_with_str_and_init(&url, &opts)?;
            let r_headers = request.headers();
            for (key, value) in &headers {
                r_headers.set(key, value)?;
            }
            match JsFuture::from(worker.fetch_with_request(&request)).await {
                Ok(resp_value) => {
                    debug!("Parsing {}", path);
                    // `resp_value` is a `Response` object.
                    assert!(resp_value.is_instance_of::<Response>());
                    let resp: Response = resp_value.dyn_into().unwrap();
                    Ok(resp)
                }
                Err(e) => {
                    error!("Request to {} failed: {:?}", path, e.as_string());
                    let try_num = try_num + 1;
                    if try_num < MAX_RETRIES {
                        error!("Retrying ({}/{})", try_num, MAX_RETRIES);
                        fetch_impl(method, path, params, headers, body, try_num).await
                    } else {
                        error!("Too many retries requesting {}, erroring", path);
                        Err(e).into()
                    }
                }
            }
        })
    }

(Pulled out of library code, but should be helpful in getting started).

seanmonstar commented 2 weeks ago

I'm not sure that's a fully accurate description. reqwest isn't specifically designed for a multi-threaded runtime. It's designed for any async runtime. The JavaScript runtime happens to fit quite well. So reqwest is able to wrap fetch.

This specific issue is not about whether reqwest should be using fetch. It's about an exception that is thrown under certain conditions. Let's keep it focused to that.

sxlijin commented 2 weeks ago

I've added this to my list of issues to keep an eye on, if I ever find time, but in the meantime, here's my mental sketch on the sequence of work that needs to happen:

  1. decide what the behavior/properties of the returned Response object from a no-cors wasm reqwest::method().send().await should be

    • per lilyball's excellent investigation in 2021 (3 years ago!), the behavior at that time was that setting no-cors on fetch changed browser behavior on the returned response type quite substantially.

      it's possible those behaviors have changed in the years since (not super likely, but it's possible). if so, not only will the implementer have to decide how to handle the new behavior, but they'll also need to decide if reqwest should support old behavior.

    • web_sys APIs i believe have different behavior in web workers / service workers than on a rendering thread. (iirc it doesn't have the same cors sandboxing that browsers enforce, and yes i realize how bizarre that sounds, so i might be very mistaken on this point.)

  2. some of these properties seem very fundamentally at odds with the standard expected behavior of Response, so the next challenge will be mapping those behaviors correctly

    • many wasm things violate standard expectations about rust safety. usually in reasonable ways, but in edge cases like this it can be very surprising (e.g. std::fs::read() compiles just fine on --target=wasm32-unknown-unknown, but panics if you ever try to call it)

    • example decision: if the response status is 0, that's not a valid reqwest::StatusCode, so what should response.status() actually do? is panicking tolerable in this situation?

    • (it probably makes sense to gate an initial implementation of this behind a cfg-flag, just because the behavior here seems surprising in a lot of ways)

  3. once all the design work is done, it will need to be implemented (this is honestly probably the easy part)

  4. wire it up with tests - .github/workflows/ci.yml currently runs wasm-pack headless tests with both chrome and firefox, and a quick skim suggests that tests/wasm_simple.rs is where the wasm-specific tests go?

sxlijin commented 2 weeks ago

Also, speaking as a user of reqwest whose codebase is shipped under 13+ different build configs, I am very grateful that reqwest provides a wasm implementation backed by fetch and that we don't have to roll our own HTTP/S abstraction types atop reqwest. I can't overstate how grateful I am for it.