seanmonstar / reqwest

An easy and powerful Rust HTTP Client
https://docs.rs/reqwest
Apache License 2.0
9.43k stars 1.05k forks source link

`no_gzip` & `gzip` not supported for wasm target #2073

Open Young-Flash opened 6 months ago

Young-Flash commented 6 months ago

Hi, thanks for the great work for reqwest. We use reqwest in Apache OpenDAL.

OpenDAL is building its wasm target support recently, there occurs an issue related to no_gzip & gzip, which reported at gdrive service can't work under wasm target.

I wonder is no_gzip & gzip support in reqwest wasm target possible for now? I'd like to help if there is possible to make it.

seanmonstar commented 6 months ago

I've done very little research, but in general: if it's something that can be supported with an option to fetch, that's an easy addition.

If not, I use these criteria to determine if we can add something for WASM:

Xuanwo commented 6 months ago

The browser handles the encoding process. Therefore, I assume our task is to set the appropriate accept-encoding header for fetch.

For example:

fetch('https://github.com/seanmonstar/reqwest/issues/2073', {
    headers: {
        'Accept-Encoding': 'identity'
    }
})
.then(response => {
    if (!response.ok) {
        throw new Error('Network response was not ok');
    }
    return response.text();
})
.then(text => {
    console.log(text);
})
.catch(error => {
    console.error('Fetch error: ', error);
});
Xuanwo commented 6 months ago

Therefore, I assume our task is to set the appropriate accept-encoding header for fetch.

Oh, accept-encoding doesn't take effect: the browser still sending gzip, deflate, br

gabrielgrant commented 6 months ago

yes, Accept-Encoding is one of the forbidden headers that can't be modified programmatically: https://developer.mozilla.org/en-US/docs/Glossary/Forbidden_header_name

As you've seen, it is set automatically by the browser based on its own capabilities. So not sure much can be done here, other than maybe erroring if trying to explicitly set any of these forbidden headers when on WASM?

seanmonstar commented 6 months ago

So then it seems there's nothing we can do, right?

Xuanwo commented 6 months ago

So then it seems there's nothing we can do, right?

We should enable gzip and brotli by default on the wasm32-unknown-unknown target to ensure proper handling of content-encoding. However, I'm not certain if zlib or brotli is available for this target.

gabrielgrant commented 6 months ago

@Xuanwo what do you mean by "enable by default"? I believe the issue is that these are (generally) used by default, regardless of whether they're set. do you mean it should somehow be reported that they were used?

My understanding is that the use of compression in the browser should be transparent to the application: the response should be decompressed by the browser automatically and returned to the application as though no compression was used (so I don't believe there should be any need for including zlib or brotli libraries, if that's what you're suggesting?)

Xuanwo commented 6 months ago

My understanding is that the use of compression in the browser should be transparent to the application: the response should be decompressed by the browser automatically and returned to the application as though no compression was used (so I don't believe there should be any need for including zlib or brotli libraries, if that's what you're suggesting?)

Thank you! Your comments have definitely cleared up my confusion about this. So we don't need to do anything here: we just need to avoid using incorrect content-length while content-encoding has a value.

gabrielgrant commented 6 months ago

Yes, I believe you're right that if the browser transparently applies compression then the reported content length (via header) will be the size of that compressed content, or may not even be sent at all, in the case of a chunked reply. In either case, it will not match the actual length of the content returned to the application

gabrielgrant commented 6 months ago

If reqwest currently expects that users can set gzip/brotli/deflate(/others?) compression scheme and that it should be handling that (de)compression on the data sent/received, that seems likely to be an bug/incorrect assumption in a browser/WASM environment. I don't know exactly how this works internally in reqwest/what the API looks like for this (does it dispatch to the relevant (de)compression function automatically if the Cargo feature is enabled and Content-Encoding is set?)

In any case, I think it likely makes sense to avoid this foot-gun by some combination of:

  1. noting in the docs that compression is automatically controlled by the browser on WASM
  2. not accepting the compression args (if those are separately exposed in the API somewhere? or just by setting headers?)
  3. disabling the optional compression-related Cargo features on WASM