dotnet / runtime

.NET is a cross-platform runtime for cloud, mobile, desktop, and IoT apps.
https://docs.microsoft.com/dotnet/core/
MIT License

handle that content-encoding can lie #58568

Closed Floyddotnet closed 2 years ago

Floyddotnet commented 3 years ago

Sometimes the server lies to us and says that the response is gzip compressed, but it is not. This happens when the remote server is configured incorrectly.

This then leads to a System.IO.InvalidDataException: The archive entry was compressed using an unsupported compression method.

My suggestion is to check in advance whether the content stream really starts with the signature 1F 8B 08 and, if not, to treat it as a normal uncompressed stream.

The submission is to be understood as a proof of concept and has not been tested yet!

How to reproduce this:

Create an HTTP response with the following content. Don't compress the content!

HTTP/1.1 200
Content-Encoding: gzip
Content-Type: text/html;charset=utf-8
Content-Length: 4

test
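
For illustration, here is a minimal client-side sketch (assuming a local server at a hypothetical address that returns exactly the response above) showing where the exception surfaces:

using System;
using System.Net;
using System.Net.Http;
using System.Threading.Tasks;

class Repro
{
  static async Task Main()
  {
    // Opt in to automatic decompression so HttpClient trusts the Content-Encoding header.
    using var handler = new HttpClientHandler { AutomaticDecompression = DecompressionMethods.GZip };
    using var client = new HttpClient(handler);

    // Hypothetical endpoint that sends "Content-Encoding: gzip" but an uncompressed body.
    // Reading the content throws System.IO.InvalidDataException
    // ("The archive entry was compressed using an unsupported compression method.").
    string body = await client.GetStringAsync("http://localhost:5550/plain");
    Console.WriteLine(body);
  }
}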

Triage

How common is the situation?

E.g. if we google for the specific error message "ERR_CONTENT_DECODING_FAILED", which is the message the Chrome browser returns in this case, we find over 17,200 entries.

In general, I think this is a rare but not unheard-of situation.

What is the source of the situation?

There are three main sources we can identify so far:

  1. Incorrectly coded server apps that add the Content-Encoding header in every case, even when they return plain text (sometimes normal requests work as expected, but exceptions and errors are returned as plain text)
  2. Incorrectly configured load balancers
  3. Incorrectly configured web servers

Possible solutions for this:

1. Check for magic bytes in the stream and fall back to uncompressed streams if they are missing

Pro:
  • the caller does not need to change anything
  • existing code doesn't need to be changed

Contra:
  • works only with formats that have a magic byte sequence
  • the behavior of existing code changes, because it no longer throws an InvalidDataException when the case happens and can be corrected
  • we must be careful to handle non-seekable streams
  • we add a small performance overhead to all requests
  • we cannot distinguish between corrupted and uncompressed transmissions (however, corrupted transmissions should already be caught at the transport layer)
  • the caller is not informed about the correction (this could be a pro too)
  • we need to consider whether this violates the RFC and, if so, decide whether we want it to

2. Make the DecompressionHandler and HttpMessageHandlerStage public and remove the sealed modifier

Pro:
  • the caller can inherit from DecompressionHandler and handle this case as he wants
  • makes it easy for the application developer to add other compression methods or handle other errors as well
  • no performance regression in regular cases
  • does not affect existing code

Contra:
  • as mentioned in comment 540707116 on issue #24205, this could be non-trivial work
  • shifts the problem of non-seekable streams to the caller
  • it's an API change
  • the caller needs to set HttpClientHandler.AutomaticDecompression to None before he can add his handler to the chain; otherwise his handler is not called

3. Add extension points or a callback to the HttpClientHandler which is delegated to the DecompressionHandler, where the caller can handle this case

Pro:
  • no performance regression in regular cases, if we find a proper solution for non-seekable streams
  • allows the application developer to add other compression methods
  • HttpClientHandler.AutomaticDecompression works as expected and does not need to be changed
  • does not affect existing code

Contra:
  • we still need to find a proper solution for non-seekable streams
  • it's an API change

4. Throw a special exception which allows access to the underlying stream

Pro:
  • the caller can handle this case
  • does not affect existing code if the exception inherits from InvalidDataException

Contra:
  • we need a publicly available / accessible solution for non-seekable streams
  • the log message changes when the case happens
  • the use of exceptions for control flow is an anti-pattern, but could maybe be accepted in this case
  • the rest of the handler pipeline won't be executed

5. Do nothing

Pro:
  • no API change
  • no performance regression

Contra:
  • the caller needs to copy the source of DecompressionHandler and set HttpClientHandler.AutomaticDecompression to None to handle this case
  • the developer can no longer benefit from bug fixes and new features in DecompressionHandler

Solutions for the non-seekable streams problem in 1-4

  1. We could wrap the NetworkStream in a BufferedStream
  2. We could define an internal CompositeStream or PrefixedStream which buffers the first bytes read and then delegates to the NetworkStream transparently
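
To illustrate option 2, a rough sketch (names like PrefixedReadOnlyStream are illustrative, not an existing API): the bytes peeked while checking for the gzip signature are replayed first, then reads are delegated to the non-seekable inner stream.

using System;
using System.IO;

// Hypothetical wrapper: replays an already-read prefix, then reads from the inner (non-seekable) stream.
public sealed class PrefixedReadOnlyStream : Stream
{
  private readonly Stream _inner;
  private readonly byte[] _prefix;
  private int _prefixPos;

  public PrefixedReadOnlyStream(byte[] prefix, Stream inner)
  {
    _prefix = prefix;
    _inner = inner;
  }

  public override int Read(byte[] buffer, int offset, int count)
  {
    // Serve the buffered prefix first, then fall through to the inner stream.
    if (_prefixPos < _prefix.Length)
    {
      int n = Math.Min(count, _prefix.Length - _prefixPos);
      Array.Copy(_prefix, _prefixPos, buffer, offset, n);
      _prefixPos += n;
      return n;
    }
    return _inner.Read(buffer, offset, count);
  }

  // Async overloads omitted for brevity; a real implementation would forward them as well.
  public override bool CanRead => true;
  public override bool CanSeek => false;
  public override bool CanWrite => false;
  public override long Length => throw new NotSupportedException();
  public override long Position
  {
    get => throw new NotSupportedException();
    set => throw new NotSupportedException();
  }
  public override void Flush() { }
  public override long Seek(long offset, SeekOrigin origin) => throw new NotSupportedException();
  public override void SetLength(long value) => throw new NotSupportedException();
  public override void Write(byte[] buffer, int offset, int count) => throw new NotSupportedException();
}

The decompression logic could then read only the first 3 bytes from the network stream, compare them to 1F 8B 08, and hand out either a GZipStream over this wrapper or the wrapper itself.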
dotnet-issue-labeler[bot] commented 3 years ago

I couldn't figure out the best area label to add to this issue. If you have write-permissions please help me learn by adding exactly one area label.

ghost commented 3 years ago

Tagging subscribers to this area: @dotnet/area-system-io-compression See info in area-owners.md if you want to be subscribed.

ghost commented 3 years ago

Tagging subscribers to this area: @dotnet/ncl See info in area-owners.md if you want to be subscribed.

stephentoub commented 3 years ago

Per https://github.com/dotnet/runtime/pull/58567#discussion_r701238462, any action here would likely just be updating a resource string in System.IO.Compression.

Floyddotnet commented 3 years ago

Per #58567 (comment), any action here would likely just be updating a resource string in System.IO.Compression.

Please reopen issue #24205.

As you suggested in #58567 (comment), I tried to create an instance of DecompressionHandler in my DelegatingHandler. Creating the DecompressionHandler needs an instance of HttpMessageHandlerStage, which is internal. I could create a wrapping handler around it, but I would have to copy the whole DecompressionHandler source. But then I can't benefit from any bug fixes and features you add in the future.

I see the following possible solutions:

karelz commented 3 years ago

We should not reopen #24205. It was closed 2 years ago. What has changed since it was closed is that we have 1 more upvote (in the last 2 years), therefore the original reasoning for its closure still stands.

Floyddotnet commented 3 years ago

We should not reopen #24205. It was closed 2 years ago. What has changed since it was closed is that we have 1 more upvote (in the last 2 years), therefore the original reasoning for its closure still stands.

Because it's closed there are no upvotes, and because there are no upvotes it stays closed?

What did you think about my posted solutions? Or does the discussion end with "bad luck" / "not solvable" / "won't be solved in any way"?

Floyddotnet commented 3 years ago

Please don't get me wrong. I think .NET is great and I admire your work. But this attitude of "it can't be what can't be" I already know from another big open-source community, and I find it frustrating and disheartening.

There are people who have a real problem. There are easy ways to solve the problem, but everything is blocked with "private", "internal" or "sealed". Then you offer a solution and it is rejected. OK. You offer more solutions and they are also rejected, and in the end you stand alone in the rain.

I would be willing to contribute as soon as I know which way you would prefer.

karelz commented 3 years ago

@Floyddotnet

Because it's closed there are no upvotes, and because there are no upvotes it stays closed?

We can edit the number of duplicated reports, or if there is suddenly a spike in interest, another issue (like this one) can be upvoted and connected to the old one. If we reopen the old one now, the key question is: What has changed since the last time we closed it? Note that we are keeping this one open for another round of triage. I am just suggesting not to reopen the old one, that is all.

What did you think about my posted solutions? Or does the discussion end with "bad luck" / "not solvable" / "won't be solved in any way"?

I didn't read your full discussion with @stephentoub on the PR yet. I am not closing this issue. We can consider alternative solutions and think about how common this case is. I have my personal opinions, but will leave it to triage discussion (and when I fully understand what is needed here).

But this attitude "it can't be what can't be"

I think you are jumping to conclusions a bit here. What could help the discussion is to pause a bit, think about the problem, the scenario (how common is it? How many customers does it impact?), and consider / list ALL potential solutions (incl. doing nothing). We will do that as part of triage. Also, it seems that there was a solution suggested above in https://github.com/dotnet/runtime/issues/58568#issuecomment-912035590, but your next comment https://github.com/dotnet/runtime/issues/58568#issuecomment-912352158 doesn't consider it and jumps to a different / more general solution, which was in the past rejected for fairly good reasons (which I pointed out). IMO we have options to discover alternatives, or constructively discuss why the previous closing was wrong / insufficient. Let's do it here before we rush into action / accusations. What do you think?

There are people who have a real problem.

Yes, as I mentioned, so far we encountered 3 people hitting the problem in 3 years. Is that fair? Do you think there are more? Great, let's talk about that -- how common is the scenario? (and is this the only / best solution)

There are easy ways to solve the problem, but everything is blocked with "private", "internal" or "sealed".

Extensibility is something that needs to be designed. Carefully. Just flipping everything to public is a road to hell and has bitten us in the past way too often. Suddenly it limits your ability to change things later, optimize things, etc. So, while it may seem like an "easy" solution, believe me, it is not.

Then you offer a solution and it is rejected. OK. You offer more solutions and they are also rejected, and in the end you stand alone in the rain.

In the context of this issue, there are only 2 solutions above -- 1 suggested by Stephen, 1 suggested by you. The discussion didn't even start here on this issue. So, why do you feel you offered more solutions and that they were rejected? Is it because of your discussion on your Draft PR? Did you feel you were ignored there, or was it the case that more things were pointed out in the context to be considered? Perhaps we should step back, think about the scenario (and similar scenarios) and list potential solutions that would fit (from general ones to very specific ones).

i would be willing to contribute as soon as i know which way would be preferred by you.

Thanks for the offer! Sadly, the hard part in issues like this is the design -- it requires careful understanding of the scenario and how it fits the intended design of the available APIs. Then it requires finding the "right" solution (incl. considering its impact on other scenarios and devs). Only then can we start talking about contributions. Getting to consensus and understanding requirements, needs, and options sometimes takes time. Not because people are lazy, but because everyone should be able to understand the needs, current limitations and mainly the impact of suggested solutions on long-term maintenance and on other scenarios. So, let's start with that and hopefully we will get to some mutually agreeable outcome. What do you think?

Floyddotnet commented 3 years ago

Thank you for the detailed reply. I agree with many points, but not with all.

Also, it seems that there was a solution suggested above in #58568 (comment), but your next comment #58568 (comment) doesn't consider it and jumps to a different / more general solution

That's wrong. stephentoub's comment refers to mine at #58567 (comment):

but then we could just return a meaningful error instead of the wrong error message "The archive entry was compressed with an unsupported compression method."

In #58568 (comment) I didn't go into this because we had already agreed on this improvement. Sorry for the ambiguity. But a better error text doesn't solve the problem, so I didn't see it as a solution, only as an improvement.

That's why I suggested other solutions in the same comment and put them up for discussion:

I see the following possible solutions:

  • Make the DecompressionHandler and HttpMessageHandlerStage public so I can inherit from them.
  • Add extension points to the HttpClientHandler.
  • Throw a custom exception like InvalidContentStreamException : InvalidDataException with a constructor (string message, Stream underlyingStream) which allows access to the underlying stream in case of any error while parsing the stream. Because this exception inherits from InvalidDataException, this isn't a breaking change in my opinion (see the sketch below).
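
For the third bullet, a minimal sketch of what such an exception type could look like (hypothetical, not an existing BCL type):

using System.IO;

// Hypothetical exception proposed above; inherits from InvalidDataException so existing catch blocks keep working.
public class InvalidContentStreamException : InvalidDataException
{
  public InvalidContentStreamException(string message, Stream underlyingStream)
    : base(message)
  {
    UnderlyingStream = underlyingStream;
  }

  // Gives the caller access to the already-received raw content
  // so it can decide how to handle the mislabeled response itself.
  public Stream UnderlyingStream { get; }
}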

Yes, as I mentioned, so far we encountered 3 people hitting the problem in 3 years. Is that fair? Do you think there are more? Great, let's talk about that -- how common is the scenario? (and is this the only / best solution)

Well, only 3 people reported that they have this problem. All others probably just turned off compression and ignored the problem. Because if we look at it closely, any solution involving an API change that we find now will be available in 1 to 2 years at the earliest, with the next .NET release or the one after that. For many this is too long. So from 3 reports you can't infer the number of affected people.


Extensibility is something that needs to be designed. Carefully.

I fully agree with this.

In the context of this issue, there are only 2 solutions above -- 1 suggested by Stephen, 1 suggested by you.

As I described above, changing the error text is not a solution to the problem.


If we reopen the old one now, the key question is: What has changed since the last time we closed it?

I just wanted to add a new use case for this feature (by referencing this issue) so that we/you can better keep track of what this feature is used for.


Feel free to discuss possible solutions further. In what format should I present my above approaches so that they are more visible and can be included in the triage? Should I include them via an edit in the first post?


Also, I can try to describe the use case a bit better. What specifically is unclear? I can think of more use cases for issue #24205 (e.g. adding more compression methods) too.

karelz commented 3 years ago

That's wrong. stephentoub's comment refers to mine

As I said, I didn't read everything. At minimum, this issue lacks some context / summary of what was discussed elsewhere. The fact that the discussion is / was happening in 2 places is not helping ...

Well, only 3 people reported that they have this problem. All others probably just turned off compression and ignored the problem. Because if we look at it closely, any solution involving an API change that we find now will be available in 1 to 2 years at the earliest, with the next .NET release or the one after that. For many this is too long. So from 3 reports you can't infer the number of affected people.

I will disagree here. A bunch of APIs have many more upvotes. While not everyone will comment or upvote, many folks will. Therefore you can treat it as a sampling method. If only 3 people cared enough to raise it, it doesn't mean only 3 people in the world hit it, but it is proportional to other issues in this repo. <5 is a fairly low number, especially spread over 4 years. That said, if you have another source for how common this issue is, we can definitely consider that, e.g. StackOverflow questions, etc.

I just wanted to add a new use case for this feature (by referencing this issue) so that we/you can better keep track of what this feature is used for.

Understood. We can do it later -- we have the rights to modify / comment on locked issues.

Feel free to discuss possible solutions further. In what format should I present my above approaches so that they are more visible and can be included in the triage? Should I include them via an edit in the first post?

I think it would be best to keep the top post for the scenario description -- which it is. And we can keep solutions in replies like the one you pasted. Are there other alternatives? For example, with a little bit of code-copying, would you be able to solve the problem in your app?

Also, I can try to describe the use case a bit better. What specifically is unclear?

Your original use case is fairly clear. The key question (at least for me) is how common the situation is. What is the server that does it? Is it perhaps a bug in the server? Was it fixed? How popular is the server? Or is it a problem of the server app? Any particular code pattern that does it?

Overall, we were hesitant in the past to accommodate weird, buggy, non-mainstream scenarios going against the RFC. On the practical side, if something is common and becomes a compatibility problem, we are fine with going beyond the RFC for good reasons. The question is: is your scenario such a case? Hopefully we have agreement that sticking primarily to the RFC is a good thing. Too much flexibility and violations lead to overall ecosystem problems.

I can think of more use cases for issue #24205 (e.g. adding more compression methods) too.

Understood. It is easy to imagine "what if" scenarios, however it is a road to hell, as one can easily overengineer something that will hardly be used at all. Therefore, what we are trying to focus on are practical scenarios - things people hit in real-world code. Like yours. Does it make sense?

Floyddotnet commented 3 years ago

Is it perhaps a bug in the server? Was it fixed? How popular is the server? Or is it a problem of the server app? Any particular code pattern that does it?

What I can tell you about the server is that it is an Apache server. After searching around a bit I figured out that this problem often happens in 2 cases.

  1. Apache or nginx is used as a reverse proxy that is configured incorrectly to communicate with the backend (see Solving reverse proxy error ERR_CONTENT_DECODING_FAILED), or the Content-Encoding is hard-coded in the server config as an additional header
  2. A server app is coded incorrectly and sets the Content-Encoding header but doesn't encode the body, for example this Node.js app:
var zlib = require('zlib');
var http = require('http');
var fs = require('fs');
http.createServer(function(request, response) {
  var raw = fs.createReadStream('index.html');
  var acceptEncoding = request.headers['accept-encoding'] || '';

  if (acceptEncoding.match(/\bgzip\b/)) {
    response.writeHead(200, { 'content-encoding': 'gzip' });
    // raw.pipe(zlib.createGzip()).pipe(response); // <-- this would be correct
    raw.pipe(response); // wrong: the header claims gzip, but the body is sent uncompressed
  } else {
    response.writeHead(200, { 'content-encoding': 'gzip' }); // also wrong: gzip is neither requested nor applied
    raw.pipe(response);
  }
}).listen(1337);

Background story:

In my case I work with APIs from companies of every size and level of expertise that provide financial and insurance services, or directly with insurance companies. The small companies usually cause the biggest problems. They outsource the development and often need months for simple adjustments because they have to pay for everything themselves. Or they develop in-house and often have little experience (in development, though not in their own field) or employ one or two young, inexperienced developers.

In my specific case the API always misbehaves when a status code in the range 400-499 is returned. The error message is then in plain text in the body, but Content-Encoding is set to gzip. I have now circumvented the problem by turning off compression completely. Nevertheless I would like to try to help solve the problem in principle.

Floyddotnet commented 3 years ago

That said, if you have another source for how common this issue is, we can definitely consider that, e.g. StackOverflow questions, etc.

If I google for the specific error message "ERR_CONTENT_DECODING_FAILED", which is the message Chrome returns in this case, I find over 17,200 entries.

https://stackoverflow.com/questions/14039804/error-330-neterr-content-decoding-failed (with upvotes in every year of the last 8 years)
https://superuser.com/questions/172951/chrome-error-330-neterr-content-decoding-failed

And many blog posts and forum entries outside of the Stack Overflow universe:

https://www.itsfullofstars.de/2018/04/solving-reverse-proxy-error-err_content_decoding_failed/
http://windowsbulletin.com/how-to-fix-err_content_decoding_failed/
https://appuals.com/how-to-fix-err_content_decoding_failed-error/
https://www.howtoforge.com/firefox-content-encoding-error-google-chrome-error-330-net-err_content_decoding_failed-unknown-error

And more if I search for this error in general:

https://github.com/grafana/k6/issues/1030

Overall, we were hesitant in the past to accommodate weird, buggy, non-mainstream scenarios going against the RFC. On the practical side, if something is common and becomes a compatibility problem, we are fine with going beyond the RFC for good reasons. The question is: is your scenario such a case? Hopefully we have agreement that sticking primarily to the RFC is a good thing. Too much flexibility and violations lead to overall ecosystem problems.

I agree with this. The problem is that if such a case happens in the real world, the developer should be able to bypass and fix it in his application without reinventing the wheel. Do you agree with this in general?

Understood. It is easy to imagine "what if" scenarios, however it is a road to hell, as one can easily overengineer something that will hardly be used at all. Therefore, what we are trying to focus on are practical scenarios - things people hit in real-world code. Like yours. Does it make sense?

Yes, I agree with this in general. I will point to the open-closed principle: "Modules should be both open (for extension) and closed (for modification)." An internal sealed class in general works against this intention of adding new extensions and features.

For example, with a little bit of code-copying, would you be able to solve the problem in your app?

Maybe yes, but I wouldn't call it a little bit :-) Because all of these classes are internal or sealed, I must copy DecompressionHandler (435 lines), HttpMessageHandlerStage, and SocketsHttpHandler (645 lines). Maybe I can skip the last one.

Floyddotnet commented 3 years ago

@karelz As you suggested, I have expanded my initial post, put together all the solutions I can think of and the ones you mentioned, and listed the pros and cons I can imagine in each case.

scalablecory commented 2 years ago

This is too fragile to make automatic -- we would not catch some cases where a server sends corrupt content -- and appears too niche to add an explicit setting for.

What I could sign on for, though, is making sure that all the API surface needed for you to be able to do this yourself is there. I think this is possible today by making a custom handler that uses GZipStream and so on. Can you try this and see? If so, I'd love to see it.

Floyddotnet commented 2 years ago

Hi scalablecory, thank you for your response.

Yes, I think this is possible. I need to disable HttpClientHandler.AutomaticDecompression and copy the source of DecompressionHandler because it is internal sealed.

I will work on it in the next few days and post the result here.

I see the following points as a little problematic:

  1. copying the code of DecompressionHandler cuts you off from any enhancements that are made later in the original
  2. if at some point it becomes necessary to change the order of the handlers in SocketsHttpHandler.SetupHandlerChain or to add a new handler between https://github.com/dotnet/runtime/blob/5ba59174ab57ff61b2be4b0da89f1a192a03a826/src/libraries/System.Net.Http/src/System/Net/Http/SocketsHttpHandler/SocketsHttpHandler.cs#L520-L523 and https://github.com/dotnet/runtime/blob/5ba59174ab57ff61b2be4b0da89f1a192a03a826/src/libraries/System.Net.Http/src/System/Net/Http/SocketsHttpHandler/SocketsHttpHandler.cs#L525-L526, so that DecompressionHandler isn't the last handler in the unchangeable default chain, then we are again faced with the same problem
scalablecory commented 2 years ago

I will work on it in the next few days and post the result here.

Thank you, this seems like the best next step.

copying the code of DecompressionHandler cuts you off from any enhancements that are made later in the original

It might actually be better this way.

To have a stable workaround against a buggy server, it's generally going to be in your best interests to remove as many variables as possible and scope your workaround to what you know works. You don't know how the server will react to future enhancements to DecompressionHandler -- it's a good variable to remove.

Because this workaround has potential to hide bugs, I might even implement this outside of a handler and against only specific URLs rather than all of them.

if at some point it becomes necessary to change the order of the handlers in SocketsHttpHandler.SetupHandlerChain or to add a new handler between

Can you rephrase this in terms of public API behavior? That handler chain is just an implementation detail and it's hard to reason about it in the same terms. If you implement a workaround on top of HttpClient today, I'd expect it to continue to work regardless of changes there.

Floyddotnet commented 2 years ago

This is a working implementation.

The content stream is copied to a MemoryStream and then worked with. I am still working on a version with a PrefixedReadOnlyStream.

With this variant the built-in DecompressionHandler is still used for Brotli and deflate.

// Usings assumed for this sample (NUnit test project referencing ASP.NET Core):
using System;
using System.Buffers;
using System.IO;
using System.IO.Compression;
using System.Linq;
using System.Net;
using System.Net.Http;
using System.Net.Http.Headers;
using System.Threading.Tasks;
using Microsoft.AspNetCore.Builder;
using Microsoft.AspNetCore.Hosting;
using Microsoft.AspNetCore.Http;
using Microsoft.Extensions.DependencyInjection;
using Microsoft.Extensions.Hosting;
using NUnit.Framework;

[TestFixture]
public class Issue58568
{
  public class MyDecompressionHandler : DelegatingHandler
  {
    private static readonly byte[] s_gzipSignature = new byte[] { 0x1F, 0x8B, 0x08 };

    public MyDecompressionHandler(HttpMessageHandler innerHandler) : base(innerHandler)
    {
    }

    protected override async Task<HttpResponseMessage> SendAsync(HttpRequestMessage request, System.Threading.CancellationToken cancellationToken)
    {
      request.Headers.AcceptEncoding.Add(new StringWithQualityHeaderValue("gzip"));

      var response = await base.SendAsync(request, cancellationToken).ConfigureAwait(false);

      if (response.Content.Headers.ContentEncoding.First() == "gzip")
      {
        await using var s = await response.Content.ReadAsStreamAsync(cancellationToken).ConfigureAwait(false);

        var ms = new MemoryStream();
        await s.CopyToAsync(ms, cancellationToken).ConfigureAwait(false);
        ms.Seek(0, SeekOrigin.Begin);

        response.Content = CheckContentSignature(ms, s_gzipSignature) 
          ? new StreamContent(new GZipStream(ms, CompressionMode.Decompress))
          : new StreamContent(ms);
      }

      return response;
    }

    protected bool CheckContentSignature(Stream s, byte[] signature)
    {
      var position = s.Position;
      var buffer = ArrayPool<byte>.Shared.Rent(signature.Length);
      try
      {
        if (s.Length < signature.Length)
        {
          return false;
        }

        s.Read(buffer);

        s.Position = position;

        for (int i = 0; i < signature.Length; i++)
        {
          if (signature[i] != buffer[i]) 
            return false;
        }

        return true;
      }
      finally
      {
        ArrayPool<byte>.Shared.Return(buffer);
      }
    }
  }

  private IHost _host;

  [OneTimeSetUp]
  public async Task Setup()
  {
    _host = await new HostBuilder()
      .ConfigureWebHost(webBuilder => webBuilder
        .UseKestrel(options => options.ListenAnyIP(5550))
        .ConfigureServices(collection => collection.AddRouting())
        .Configure(app => app
          .UseRouting()
          .UseEndpoints(e =>
          {
            e.MapGet("/gzip", c =>
            {
              c.Response.Headers.Add("Content-Encoding", "gzip");
              return c.Response.Body.WriteAsync(Convert.FromBase64String("H4sIAAAAAAAA//NIzcnJVyjPL8pJUQQAlRmFGwwAAAA=")).AsTask();
            });

            e.MapGet("/plain", c =>
            {
              c.Response.Headers.Add("Content-Encoding", "gzip");
              return c.Response.WriteAsync("Hello world!");
            });
          })))
      .StartAsync();
  }

  [TestCase("/gzip")]
  [TestCase("/plain")]
  public async Task Test(string endpoint)
  {
    var client = new HttpClient(new MyDecompressionHandler(new HttpClientHandler(){AutomaticDecompression = DecompressionMethods.Brotli | DecompressionMethods.Deflate}));

    var response = await client.GetAsync("http://localhost:5550" + endpoint);

    Assert.AreEqual("Hello world!", await response.Content.ReadAsStringAsync());
  }
}

Can you rephrase this in terms of public API behavior? That handler chain is just an implementation detail and it's hard to reason about it in the same terms. If you implement a workaround on top of HttpClient today, I'd expect it to continue to work regardless of changes there.

Yes, of course. I just wanted to point out that as soon as it becomes necessary, e.g. with a new HTTP standard version, to make decisions based on the decompressed stream, and these decisions are processed within the default handler chain, this workaround no longer works completely.

One would then have to take this new logic into account in the workaround as well, or create additional workaround handlers.

All because the default DecompressionHandler cannot be overridden, replaced or extended.

A hypothetical handler chain of an HttpClient:

// from SetupHandlerChain
1. HttpConnectionHandler / HttpAuthenticatedConnectionHandler
2. DiagnosticsHandler
3. RedirectHandler
4. DecompressionHandler
5. NewBuiltInHandler // <-- this could cause trouble
-------------
// from User / HttpClient constructor
6. MyDecompressionHandler
scalablecory commented 2 years ago

Thanks. This looks like the correct strategy to me.

Re: the handler chain, I don't think this is something to worry about. We do explicitly support the no automatic decompression scenario, to allow the caller to decompress manually, so this isn't something we'll break. Any handler we add after decompression would likely depend on the content being uncompressed and thus be turned off if auto-decompression was disabled, so you wouldn't notice any difference.

scalablecory commented 2 years ago

Since a clean workaround is available and this is a niche case where any change by us would result in unsafe behavior, this isn't something we're interested in acting on.

Closing out the issue. If you do find a case where a workaround is not possible, please do open a new one.

Floyddotnet commented 2 years ago

I can accept that adding a workaround directly into the DecompressionHandler is not wanted.

Re: the handler chain, I don't think this is something to worry about. We do explicitly support the no automatic decompression scenario, to allow the caller to decompress manually, so this isn't something we'll break. Any handler we add after decompression would likely depend on the content being uncompressed and thus be turned off if auto-decompression was disabled, so you wouldn't notice any difference.

As you can see in the workaround, automatic decompression is NOT disabled; it is only bypassed for gzip. Also, such a new handler would already cause problems.

Since a clean workaround is available and this is a niche case where any change by us would result in unsafe behavior, this isn't something we're interested in acting on.

I don't quite understand where the reluctance to make either of the following changes comes from and would appreciate an explanation.

I am a fan of clean solutions