dotnet / runtime

.NET is a cross-platform runtime for cloud, mobile, desktop, and IoT apps.
https://docs.microsoft.com/dotnet/core/
MIT License
15.09k stars 4.7k forks source link

Sync API for HttpClient #32125

Closed ManickaP closed 4 years ago

ManickaP commented 4 years ago

Provide sync API on HttpClient, which currently has only async methods.

Motivation

Azure SDK currently exposes synchronous API which does sync-over-async on HttpClient (see HttpClientTransport.cs). It would be much better if we provided them with sync API with proper sync implementation, where feasible.

Also there are lots of existing code bases that have very deep synchronous call stacks. Developers are simply not willing to rewrite these code bases to be asynchronous. If they need to call async only API in these synchronous methods, they use sync-over-async, which then in turn causes "soft" deadlock. We want to provide synchronous APIs for these developers because synchronous APIs, although inefficient, can can help in avoiding these soft deadlocks.

Another advantage of sync API is that it is much easier to grasp. Especially for people with no prior knowledge of asynchronous processing. If someone is starting with HttpClient, they also have to be knowledgeable of C# async/await. Also many examples with `HttpClient might be simplified and thus making the entry into .NET easier.

Proposed API

Minimal Necessary Change

This change is based on @stephentoub prototype in https://github.com/stephentoub/corefx/commit/0e4d6401e76c79f8ef81b5064322b8de18a115a2. The prototype introduces synchronous API for SocketsHttpHandler and also properly implements it for HTTP 1.1 scenarios (for HTTP 2 does sync-over-async). This proposal extends the existing prototype with sync API on HttpClient. Following changes are a minimal set to achieve synchronous HttpClient.Send behaving trully synchronously at least for HTTP 1.1.

Note that CancellationToken is used in synchronous methods in order to propagate HttpClient timeout and cancellations. For HttpContent it's up for discussion whether to add other overload to CopyTo and SerializeToStream to match the async counterparts.

The prototype/early implementation for the minimal change can be found in my branch https://github.com/ManickaP/runtime/tree/mapichov/sync_http_api.

// The main classes where we want the sync API.
public partial class HttpMessageInvoker : IDisposable
{
    ...
    // Existing async method
    public virtual Task<HttpResponseMessage> SendAsync(HttpRequestMessage request, CancellationToken cancellationToken);

    // Proposed new sync methods
    public virtual HttpResponseMessage Send(HttpRequestMessage request, CancellationToken cancellationToken);
}
public partial class HttpClient : HttpMessageInvoker
{
    ...
    // Existing async methods
    public Task<HttpResponseMessage> SendAsync(HttpRequestMessage request);
    public Task<HttpResponseMessage> SendAsync(HttpRequestMessage request, HttpCompletionOption completionOption);
    public Task<HttpResponseMessage> SendAsync(HttpRequestMessage request, HttpCompletionOption completionOption, CancellationToken cancellationToken);
    public override Task<HttpResponseMessage> SendAsync(HttpRequestMessage request, CancellationToken cancellationToken);

    // Proposed new sync methods
    public HttpResponseMessage Send(HttpRequestMessage request, HttpCompletionOption completionOption = default, CancellationToken cancellationToken = default);
    public override HttpResponseMessage Send(HttpRequestMessage request, CancellationToken cancellationToken);
}

// The changes bellow enable proper implementation of the sync API.

// HttpMessageHandler and derived classes
public abstract partial class HttpMessageHandler : IDisposable
{
    ...
    // Existing async method
    protected internal abstract Task<HttpResponseMessage> SendAsync(HttpRequestMessage request, CancellationToken cancellationToken);

    // Proposed new sync method
    protected internal virtual HttpResponseMessage Send(HttpRequestMessage request, CancellationToken cancellationToken);
}
public abstract partial class DelegatingHandler : HttpMessageHandler
{
    ...
    // Proposed new sync method
    protected internal override HttpResponseMessage Send(HttpRequestMessage request, CancellationToken cancellationToken);
}
public abstract partial class HttpClientHandler : HttpMessageHandler
{
    ...
    // Proposed new sync method
    protected internal override HttpResponseMessage Send(HttpRequestMessage request, CancellationToken cancellationToken);
}
public abstract partial class SocketsHttpHandler : HttpMessageHandler
{
    ...
    // Proposed new sync method
    protected internal override HttpResponseMessage Send(HttpRequestMessage request, CancellationToken cancellationToken);
}
public abstract partial class MessageProcessingHandler : DelegatingHandler
{
    ...
    // Proposed new sync method
    protected internal sealed override HttpResponseMessage Send(HttpRequestMessage request, CancellationToken cancellationToken);
}

// HttpContent and derived classes
public abstract partial class HttpContent : IDisposable
{
    ...
    // Existing async methods
    public Task CopyToAsync(Stream stream);
    public Task CopyToAsync(Stream stream, TransportContext context);
    public Task CopyToAsync(Stream stream, TransportContext context, CancellationToken cancellationToken);
    public Task CopyToAsync(Stream stream, CancellationToken cancellationToken);

    protected abstract Task SerializeToStreamAsync(Stream stream, TransportContext context);
    protected virtual Task SerializeToStreamAsync(Stream stream, TransportContext context, CancellationToken cancellationToken);

    // Proposed new sync methods
    public void CopyTo(Stream stream, TransportContext context, CancellationToken cancellationToken);

    protected virtual void SerializeToStream(Stream stream, TransportContext context, CancellationToken cancellationToken);
}
public partial class ByteArrayContent : HttpContent
{
    ...
    // Proposed new sync method
    protected override void SerializeToStream(Stream stream, TransportContext context, CancellationToken cancellationToken);
}
public partial class MultipartContent : HttpContent, IEnumerable<HttpContent>, IEnumerable
{
    ...
    // Proposed new sync method
    protected override void SerializeToStream(Stream stream, TransportContext context, CancellationToken cancellationToken);
}
public sealed partial class ReadOnlyMemoryContent : HttpContent
{
    ...
    // Proposed new sync method
    protected override void SerializeToStream(Stream stream, TransportContext context, CancellationToken cancellationToken);
}
public partial class StreamContent : HttpContent
{
    ...
    // Proposed new sync method
    protected override void SerializeToStream(Stream stream, TransportContext context, CancellationToken cancellationToken);
}

Full Change

This includes sync counterparts to all async HttpClient methods. Since their implementation delegates to Send underneath, there's no need to add any other supporting sync methods. Note that all the sync methods on HttpClient (except for HttpMessageInvoker.Send override) do not need oveloads with CancellationToken. Whether to include them or not is up for discussion.

public partial class HttpClient : HttpMessageInvoker
{
    ...
    // Existing async methods
    public Task<HttpResponseMessage> DeleteAsync(string requestUri);
    public Task<HttpResponseMessage> DeleteAsync(string requestUri, CancellationToken cancellationToken);
    public Task<HttpResponseMessage> DeleteAsync(Uri requestUri);
    public Task<HttpResponseMessage> DeleteAsync(Uri requestUri, CancellationToken cancellationToken);
    public Task<HttpResponseMessage> GetAsync(string requestUri);
    public Task<HttpResponseMessage> GetAsync(string requestUri, HttpCompletionOption completionOption);
    public Task<HttpResponseMessage> GetAsync(string requestUri, HttpCompletionOption completionOption, CancellationToken cancellationToken);
    public Task<HttpResponseMessage> GetAsync(string requestUri, CancellationToken cancellationToken);
    public Task<HttpResponseMessage> GetAsync(Uri requestUri);
    public Task<HttpResponseMessage> GetAsync(Uri requestUri, HttpCompletionOption completionOption);
    public Task<HttpResponseMessage> GetAsync(Uri requestUri, HttpCompletionOption completionOption, CancellationToken cancellationToken);
    public Task<HttpResponseMessage> GetAsync(Uri requestUri, CancellationToken cancellationToken);
    public Task<byte[]> GetByteArrayAsync(string requestUri);
    public Task<byte[]> GetByteArrayAsync(string requestUri, CancellationToken cancellationToken);
    public Task<byte[]> GetByteArrayAsync(Uri requestUri);
    public Task<byte[]> GetByteArrayAsync(Uri requestUri, CancellationToken cancellationToken);
    public Task<Stream> GetStreamAsync(string requestUri);
    public Task<Stream> GetStreamAsync(string requestUri, CancellationToken cancellationToken);
    public Task<Stream> GetStreamAsync(Uri requestUri);
    public Task<Stream> GetStreamAsync(Uri requestUri, CancellationToken cancellationToken);
    public Task<string> GetStringAsync(string requestUri);
    public Task<string> GetStringAsync(string requestUri, CancellationToken cancellationToken);
    public Task<string> GetStringAsync(Uri requestUri);
    public Task<string> GetStringAsync(Uri requestUri, CancellationToken cancellationToken);
    public Task<HttpResponseMessage> PatchAsync(string requestUri, HttpContent content);
    public Task<HttpResponseMessage> PatchAsync(string requestUri, HttpContent content, CancellationToken cancellationToken);
    public Task<HttpResponseMessage> PatchAsync(Uri requestUri, HttpContent content);
    public Task<HttpResponseMessage> PatchAsync(Uri requestUri, HttpContent content, CancellationToken cancellationToken);
    public Task<HttpResponseMessage> PostAsync(string requestUri, HttpContent content);
    public Task<HttpResponseMessage> PostAsync(string requestUri, HttpContent content, CancellationToken cancellationToken);
    public Task<HttpResponseMessage> PostAsync(Uri requestUri, HttpContent content);
    public Task<HttpResponseMessage> PostAsync(Uri requestUri, HttpContent content, CancellationToken cancellationToken);
    public Task<HttpResponseMessage> PutAsync(string requestUri, HttpContent content);
    public Task<HttpResponseMessage> PutAsync(string requestUri, HttpContent content, CancellationToken cancellationToken);
    public Task<HttpResponseMessage> PutAsync(Uri requestUri, HttpContent content);
    public Task<HttpResponseMessage> PutAsync(Uri requestUri, HttpContent content, CancellationToken cancellationToken);

    // Proposed new sync methods
    public HttpResponseMessage Delete(string requestUri);
    public HttpResponseMessage Delete(string requestUri, CancellationToken cancellationToken);
    public HttpResponseMessage Delete(Uri requestUri);
    public HttpResponseMessage Delete(Uri requestUri, CancellationToken cancellationToken);
    public HttpResponseMessage Get(string requestUri);
    public HttpResponseMessage Get(string requestUri, HttpCompletionOption completionOption);
    public HttpResponseMessage Get(string requestUri, HttpCompletionOption completionOption, CancellationToken cancellationToken);
    public HttpResponseMessage Get(string requestUri, CancellationToken cancellationToken);
    public HttpResponseMessage Get(Uri requestUri);
    public HttpResponseMessage Get(Uri requestUri, HttpCompletionOption completionOption);
    public HttpResponseMessage Get(Uri requestUri, HttpCompletionOption completionOption, CancellationToken cancellationToken);
    public HttpResponseMessage Get(Uri requestUri, CancellationToken cancellationToken);
    public byte[] GetByteArray(string requestUri);
    public byte[] GetByteArray(string requestUri, CancellationToken cancellationToken);
    public byte[] GetByteArray(Uri requestUri);
    public byte[] GetByteArray(Uri requestUri, CancellationToken cancellationToken);
    public Stream GetStream(string requestUri);
    public Stream GetStream(string requestUri, CancellationToken cancellationToken);
    public Stream GetStream(Uri requestUri);
    public Stream GetStream(Uri requestUri, CancellationToken cancellationToken);
    public string GetString(string requestUri);
    public string GetString(string requestUri, CancellationToken cancellationToken);
    public string GetString(Uri requestUri);
    public string GetString(Uri requestUri, CancellationToken cancellationToken);
    public HttpResponseMessage Patch(string requestUri, HttpContent content);
    public HttpResponseMessage Patch(string requestUri, HttpContent content, CancellationToken cancellationToken);
    public HttpResponseMessage Patch(Uri requestUri, HttpContent content);
    public HttpResponseMessage Patch(Uri requestUri, HttpContent content, CancellationToken cancellationToken);
    public HttpResponseMessage Post(string requestUri, HttpContent content);
    public HttpResponseMessage Post(string requestUri, HttpContent content, CancellationToken cancellationToken);
    public HttpResponseMessage Post(Uri requestUri, HttpContent content);
    public HttpResponseMessage Post(Uri requestUri, HttpContent content, CancellationToken cancellationToken);
    public HttpResponseMessage Put(string requestUri, HttpContent content);
    public HttpResponseMessage Put(string requestUri, HttpContent content, CancellationToken cancellationToken);
    public HttpResponseMessage Put(Uri requestUri, HttpContent content);
    public HttpResponseMessage Put(Uri requestUri, HttpContent content, CancellationToken cancellationToken);
}

@stephentoub @KrzysztofCwalina @dotnet/ncl

antonfirsov commented 4 years ago

Do we have any existing synchronous API methods in .NET accepting CancellationToken?

alnikola commented 4 years ago

It would be better to clearly separate the public cancellationToken parameter purpose from HttpClient's own timeout which is set via Timeoutproperty. HttpClient's timeout will get a special handling logic once the PR #2281 is completed whereas the cancellationToken method parameter enables the caller to cancel request as they deem needed.

davidsh commented 4 years ago

Do we have any existing synchronous API methods in .NET accepting CancellationToken?

We do not. CancellationToken's are something that was introduced into .NET as part of adding async/await/Task APIs.

In general, we have avoided adding new sync APIs for I/O. But as described above, there is reason now to support sync API for broadly used APIs like HttpClient.

I am curious, though, how the CancellationToken parameter will be used in these APIs. Sync APIs by their nature are difficult to cancel.

Tratcher commented 4 years ago

I am curious, though, how the CancellationToken parameter will be used in these APIs. Sync APIs by their nature are difficult to cancel.

How is it any more difficult than implementing timeout support?

scalablecory commented 4 years ago

We will need to make the sync HTTP/1 implementation dispose the Socket, as the cancelable async methods won't be used. I think that's probably fine.

Tratcher commented 4 years ago

We will need to make the sync HTTP/1 implementation dispose the Socket, as the cancelable async methods won't be used. I think that's probably fine.

That approach has a long history of AVs. Have those been dealt with?

scalablecory commented 4 years ago

We will need to make the sync HTTP/1 implementation dispose the Socket, as the cancelable async methods won't be used. I think that's probably fine.

That approach has a long history of AVs. Have those been dealt with?

Does it? I hope so!

scalablecory commented 4 years ago

We should consider using default parameters here:

public Task<HttpResponseMessage> GetAsync(string requestUri);
public Task<HttpResponseMessage> GetAsync(string requestUri, HttpCompletionOption completionOption);
public Task<HttpResponseMessage> GetAsync(string requestUri, HttpCompletionOption completionOption, CancellationToken cancellationToken);
public Task<HttpResponseMessage> GetAsync(string requestUri, CancellationToken cancellationToken);

into

public Task<HttpResponseMessage> GetAsync(string requestUri, HttpCompletionOption completionOption = HttpCompletionOption.ResponseContentRead, CancellationToken cancellationToken = default);
ManickaP commented 4 years ago

Add CancelationToken: we do not need to add all the HttpClient sync methods with it, we need only the HttpMessageInvoker.Send overload.

Also the CancelationToken handles not just timeout but request cancellation as well (HttpClient.CancelPendingRequests).

Finally, in the prototype, the CancelationToken is regularly checked for cancellation wherever it cannot be passed further down in the call chain.

@alnikola: I've seen your PR, the timeout is still propagated by CancelationToken which would nicely work with the proposed solution.

karelz commented 4 years ago

Team discussion:

  1. Use default values in HttpClient.Send done
  2. We propose to NOT take the convenience overloads ("full change") - @stephentoub did you have some arguments for taking them?
  3. We are OK to take CancellationTokens for synchronous APIs - we should honor ALSO existing timeouts (e.g. HttpClient.Timeout and stream Timeouts)
    • UPDATE on 3/19 -- stream timeouts: We will do the Timeouts prototype separately from sync APIs. We need prototype to see the impact on perf and code complexity. Default value of timeouts should be infinite/-1 to avoid allocating timer/CancellationToken.
tmds commented 4 years ago

Is the motivation strong enough for adding sync methods?

C# has great support for async programming. And sync APIs means blocking threads...

MihaMarkic commented 4 years ago

One reason would be legacy code.

Drawaes commented 4 years ago

I think adding sync methods is a mistake and if upstream apis are exposing things as sync when there is underlying async that is their mistake to make, if you put these methods in I can see people falling into the trap of using them in a ecosystem that has embraced async. What is the scenario (outside of legacy) that you need a sync API these days. Servers are mostly async (Asp.net core being the most used in this space). UIs should definitely be using async

benaadams commented 4 years ago

They aren't sync; they are blocking apis; could they be prefixed or postfixed as such?

 public HttpResponseMessage BlockingGet(string requestUri);

or

 public HttpResponseMessage GetBlocking(string requestUri);
stephentoub commented 4 years ago

What is the scenario (outside of legacy) that you need a sync API these days.

Two:

  1. Simple use cases where the scalability of async isn't required, but more importantly...
  2. Consuming APIs that are sync, and where forcing sync-over-async is worse than just sync. This isn't just "legacy", e.g. brand new Azure SDK clients exposing sync methods.

I'll let @Petermarcu and @KrzysztofCwalina speak to Azure SDK, though it's just one example.

stephentoub commented 4 years ago

They aren't sync

How so?

stephentoub commented 4 years ago

We propose to NOT take the convenience overloads ("full change") - @stephentoub did you have some arguments for taking them?

I'm fine with that (though it hampers the "getting started" case).

davidfowl commented 4 years ago

How is HTTP/2 going to work?

stephentoub commented 4 years ago

How is HTTP/2 going to work?

There will be some blocking involved; that's unavoidable. HTTP/1.x should be sync all the way down.

Drawaes commented 4 years ago

I guess my main issue is now the first thing people will see is the "Send" method and until now async only methods on HttpClient has been a pit of success story, I fear that will change.

stephentoub commented 4 years ago

has been a pit of success story

That's exactly the problem... it hasn't been. Lots of call sites that require sync are forced to figure out how to block and then when they do they fall over at even small loads.

stephentoub commented 4 years ago

the first thing people will see is the "Send" method

I don't think it is. Forget the alphabetical ordering, we're talking about 1 method on HttpClient among more than 30. GetAync/PostAsync are much more likely to be the "first thing people will see".

We're simply making it possible for sync clients to not suck as much as they currently do. And sometimes you have no choice but to be sync.

jkotas commented 4 years ago

We're simply making it possible for sync clients to not suck as much as they currently do.

This API is a performance optimization for a specific use. Do we have any estimates to quantify the improvement we expect to get out this API for these specific use cases, e.g. microbenchmark that demonstrates the improvement?

stephentoub commented 4 years ago

This API is a performance optimization for a specific use. Do we have any estimates to quantify the improvement we expect to get out this API for these specific use cases, e.g. microbenchmark that demonstrates the improvement?

Yes. The key is it ends up requiring many more threads in order to sustain throughput. And it can take a long time for the thread pool to ramp up to the necessary level, while in the interim the system can essentially be deadlocked. That means either the app-level dev needs to explicitly set a high min thread count to force the ramp up, or we need to make the pool way more aggressive at thread injection, which harms other cases.

KrzysztofCwalina commented 4 years ago

I think adding sync methods is a mistake and if upstream apis are exposing things as sync when there is underlying async that is their mistake to make, if you put these methods in I can see people falling into the trap of using them in a ecosystem that has embraced async.

@Drawaes, I totally agree with this problem. We want to solve it by having associated analyzers that detect when the sync APIs are called in a body of an async method

jkotas commented 4 years ago

The key is it ends up requiring many more threads in order to sustain throughput.

Right, a net cost of waiting thread costs is ~10kB's of memory, let's make it 30kB. So this APIs allows you to save approximately 30kB * N memory where N is number of parallel pending network requests. If this is important to save this, it sounds good to me. Just want to throw some numbers into the discussions - all discussions about performance have to have numbers :-)

we need to make the pool way more aggressive at thread injection, which harms other cases.

We do not have to do this unconditionally. We can introduce a low-level API that let's threadpool know that the thread is stuck in sync-over-async and discount it in heuristics that decide whether to inject new threads. The advantage of doing this would be that it makes most sync-over-async problems better; the disadvantage is that you still burn 30kB of memory per instance so that it is not as good as the hand-crafted solution for the specific HttpClient case.

stephentoub commented 4 years ago

a net cost of waiting thread costs

It's not that simple. It's not just about the memory implications of threads sitting around waiting for something to do. It's also about what happens when they do have something to do. If I create the potential for a thousand threads just in case, and then a thousand work items get queued that end up doing computationally and memory-intensive work, I've now completely oversubscribed my system, just in case I needed to alleviate a potential deadlock.

This also requires the app developer to understand the implications of the libraries they're using, or the ones they're indirectly using, and configure the thread pool because some sync APIs are being used.

We can introduce a low-level API that let's threadpool know that the thread is stuck in sync-over-async and discount it in heuristics that decide whether to inject new threads.

I'm happy with doing that as well, assuming we can come up with the right, well-tuned heurstics. I don't think it's as easy as just pretending that the thread currently used for sync-over-async doesn't exist... code like for (int i = 0; i < 10_000; i++) TP.QueueUserWorkItem(_ => AsyncRequest().Wait()); will blow up if/as we try to create 10_000 threads, or may deadlock if we stop doing so.

davidfowl commented 4 years ago

There will be some blocking involved; that's unavoidable. HTTP/1.x should be sync all the way down.

Same with HTTP/3. I don't have a big issue with this change I just think the people thinking about it are thinking about HTTP/1 (the consumers). Also thinking about streaming scenarios is interesting...

Anyways 😄

stephentoub commented 4 years ago

I just think the people thinking about it are thinking about HTTP/1

Maybe some. As you can see in the first paragraph of the cited https://github.com/stephentoub/corefx/commit/0e4d6401e76c79f8ef81b5064322b8de18a115a2, I'm not one of them: "We could add such APIs and make the 95% HTTP/1.1 case work fully synchronously without terribly impacting the code base. But HTTP/2 is really hard."

KrzysztofCwalina commented 4 years ago

@ManickaP, I think it would be good to update the "motivation" section of this issue and add the main reason for this feature: there are lots of existing code bases that have very deep synchronous call stacks. Developers are simply not willing to rewrite these code bases to be asynchronous. If they need to call async only API in these synchronous methods, they use sync-over-async, which then in turn causes "soft" deadlock. We want to provide synchronous APIs for these developers because synchronous APIs, although inefficient, can can help in avoiding these soft deadlocks.

Of course, as you said usability is also a factor. We know that many developers prefer to prototype code using synchronous APIs. Synchronous APIs are better to prototyping. For example, they are easier to observe in the debugger.

jkotas commented 4 years ago

If I create the potential for a thousand threads just in case

There would be at most one extra thread created for each synchronously waiting thread created by the application. So this should be reducing maximum number of threads by 2 at most. If the application creates 1000 threads, it has a bad performance problem in the first place and this won't fix it (maybe just move the needle a little bit).

a thousand work items get queued that end up doing computationally and memory-intensive work

This is problem with async today as well. I do not think that the synchronous APIs like this should be advertised as a solution for the problem where CPU and memory-intensive work overwhelms the system and your I/O is not getting scheduled.

stephentoub commented 4 years ago

I think we're talking about different things. My point was that in order to avoid deadlocking / incredibly long ramp up times any time there's a spike in traffic, app developers directly or indirectly employing sync-over-async calls need to set a very high min thread count for the thread pool, but such a min count isn't restricted to just work items that perform sync-over-async, but rather everything that's queued to the pool.

benaadams commented 4 years ago

They aren't sync

How so?

Its unfortunate that we ended up with sync-vs-async as the terms. Synchronous is work done as same time; so the fast path of async, cpu bound tasks and other forms of IO that immediately return (data available).

.NET currently general has two forms of apis:

Both "Blocking" and "Async" are written as if they are synchronous code; whereas callback, non-blocking, event and push (Rx) need to be written differently, so are powerful representations because they look like everything else and don't unnecessarily consume cpu.

Calling "Async" "async" is fine as it describes the mechanism; however calling "Blocking" "synchronous" makes it sound more benign than it is; does not communicate the mechanism and more problematically suggests its something different than what it is.

So; to me, the phrases "sync-over-async" and "async-over-sync" while I know what the problems are, the problems are not immediately conveyed; and also raise the question is it problematic to do cpu-bound work (i.e. sync) in async method, which isn't the problem.

If the phrase was "blocking-over-async" and "async-over-blocking"; it much more immediately clear what the problem is (the waiting mechanisms are in conflict). This is also coupled with the issue that the "blocking" apis seem simpler both in name and usage (historical PR is on their side).

It additionally becomes confusing when talking about async completing immediately; which is the "sync" fast-path (which is true); but has nothing to do with the "blocking" apis which are also referred to as "sync".

I don't necessarily think I'll gain traction in changing the terminology used; but it is my hill... 😄

the first thing people will see is the "Send" method

I don't think it is. Forget the alphabetical ordering, we're talking about 1 method on HttpClient among more than 30.

I am happier with this restriction rather than providing "blocking" apis for everything

Drawaes commented 4 years ago

I can understand the "deep sync(avoiding the definition)" argument but then if you are moving a code base to .net core where these are missing maybe it's time to consider them as part of the move. Either way it's a valid case.

However the prototyping I am unsure of. If you make a console app with async Task as it's return type it's simply the await keyword, is there really a difference in that?

stephentoub commented 4 years ago

Its unfortunate that we ended up with sync-vs-async as the terms.

FWIW, I don't think it's unfortunate, but reasonable people can differ.

My use of sync, async, and sync-over-async is entirely about when and how results are returned. If you call a function and it returns the result to you, that's synchronous. If you call a function and it returns to you without yet providing the result, that's asynchronous (whether you explicitly write a callback or you use a keyword to let the compiler generate the callback for you or you poll). If you expose a sync method (it returns the answer synchronously), but you internally implement it by calling an asynchronous method and thus necessarily may block in order to wait for the answer to come, that's sync-over-async.

darrelmiller commented 4 years ago

It seems the default DelegatingHandler implementation of Send does a blocking call to SendAsync. Doesn't that mean that any app developer that starts trying to call Send with an existing DelegatingHandler will end up doing Blocking over Async and not realize it? Would it not be better to just throw a NotImplementedException if someone tries to call Send and the DelegatingHandler doesn't support Send.

stephentoub commented 4 years ago

It seems the default DelegatingHandler implementation of Send does a blocking call to SendAsync.

That should not be the case. DelegatingHandler.Send should call InnerHandler.Send.

What is true is that the base HttpMessageHandler.Send will call SendAsync and block waiting for it. All of the built-in HttpMessageHandler-derived types will override Send to do the "right thing", but other HttpMessageHandler-derived types will need to override Send, or else usage of HttpClient.Send will end up using their SendAsync implementations and blocking until they do.

darrelmiller commented 4 years ago

@stephentoub Yeah., my mistake, I was reading the wrong class. It does call innerHandler.Send. But if HttpMessageHandler then calls SendAsync and blocks, then the effect is the same. This seems like there is a risk that devs are going to think they are getting pure Sync behaviour and that won't happen if unmodified classes that derive from DelegatingHandler are present in the pipeline.

Drawaes commented 4 years ago

That also is a very interesting point. We use delegating handlers (custom) for service discovery and for metrics collections. So even if you have a non sync over async method it will need to call the SendAsync on those won't it?

stephentoub commented 4 years ago

This seems like there is a risk that devs are going to think they are getting pure Sync behaviour

Yup, it's one of the key concerns I previously called out. But I think the alternative of having HttpMessageHandler.Send throw NotSupportedException is worse.

itsmecurtis commented 4 years ago

the first thing people will see is the "Send" method

I don't think it is. Forget the alphabetical ordering, we're talking about 1 method on HttpClient among more than 30. GetAync/PostAsync are much more likely to be the "first thing people will see".

Could the new sync .Send be an explicit implementation of a new interface (IBlockingHttpClient?) so that it will be clearer that it supports a very narrow set of scenarios?

KrzysztofCwalina commented 4 years ago

I can understand the "deep sync(avoiding the definition)" argument but then if you are moving a code base to .net core where these are missing maybe it's time to consider them as part of the move. Either way it's a valid case.

We don't want to add friction for people who want to move to .NET Core.

However the prototyping I am unsure of. If you make a console app with async Task as it's return type it's simply the await keyword, is there really a difference in that?

The differences are small but non-trivial. We have quite consistent evidence from usability studies that the average .NET developer still struggles with async code. When code is already written and eveything works fine, async code looks simple and great. But for example, when people need to debug async code, the experience is not as great (and probably never will be), as I am sure you are aware.

Also, there is a list of very small things that people run into when we watch them writing async code. Speaking about your console app example: when you create a console app in VS, the template is a synchronous main. Many people don't actually know how to convert it to async main. They add the async keyword and forget to change the return value to Task.

jkotas commented 4 years ago

My point was that in order to avoid deadlocking / incredibly long ramp up times any time there's a spike in traffic, app developers directly or indirectly employing sync-over-async calls need to set a very high min thread count for the thread pool.

Yes, it is the case today. I believe that we can make it better by making the threadpool away of sync-over-async.

I am with @davidfowl that we need to see where we are going to end up eventually. It will take several years until this new synchronous API gets adoption. The cloud protocols are going to evolve in the meantime. For how long do we expect majority of the cloud / Azure to be on HTTP/1?

If the answer is that HTTP/1 started fading away or will start fading away soon, we need to start putting plan for HTTP/2 now to stay ahead of the curve. What would the plan for HTTP/2 or HTTP/3 be? Do the hard ugly rewrite? Or make threadpool aware of sync-over-async?

Drawaes commented 4 years ago

Speaking about your console app example: when you create a console app in VS, the template is a synchronous main.

Time to change the template?

stephentoub commented 4 years ago

I believe that we can make it better by making the threadpool away of sync-over-async.

I would like to understand the heuristics being proposed for that. I agree it'd be excellent to improve things there, I've just yet to see a solution that addresses the issue while not having degenerate behavior in not-terribly-unreasonable situations.

It will take several years until this new synchronous API gets adoption.

Azure SDK is planning to adopt it for when .NET 5 ships. They already ship synchronous methods.

What would the plan for HTTP/2 or HTTP/3 be? Do the hard ugly rewrite?

Currently blocking we do in the implementation could require up to N * 2 threads (for N requests). With a rewrite / thoughtfulness I expect we could get it to N + 1 threads (with a hard/ugly rewrite). I don't see any way to guarantee no more than N, in any implementation.

For how long do we expect majority of the cloud / Azure to be on HTTP/1?

@KrzysztofCwalina? @Petermarcu?

benaadams commented 4 years ago

What would the plan for HTTP/2 or HTTP/3 be? Do the hard ugly rewrite?

Currently blocking we do in the implementation could require up to N * 2 threads (for N requests). With a rewrite / thoughtfulness I expect we could get it to N + 1 threads (with a hard/ugly rewrite). I don't see any way to guarantee no more than N, in any implementation.

Aside https://github.com/stephentoub/corefx/commit/0e4d6401e76c79f8ef81b5064322b8de18a115a2 would only have an improvement on Windows for sync http/1.1? (as on Linux sockets don't use sync IO, even for "sync" calls)

stephentoub commented 4 years ago

as on Linux sockets don't use sync IO (event for sync calls)

This is not true. If you perform an asynchronous operation on a Socket, then yes, subsequent sync operations will actually be sync-over-async. But if you just do sync operations on a Socket, it's sync all the way down.

Even if it does, though, there's an important distinction: unblocking the waiting consumer doesn't require another work item be queued and run by the thread pool... the consumer is unblocked directly from the epoll thread. It doesn't result in the same N*2 issue that causes the thread pool scalability issues any more than normal truly-synchronous I/O.

KrzysztofCwalina commented 4 years ago

For how long do we expect majority of the cloud / Azure to be on HTTP/1?

There are currently no non HTTP/1 services and none planned. Services that need persistent connections and duplex communication use AMQP

benaadams commented 4 years ago

as on Linux sockets don't use sync IO (event for sync calls)

This is not true. If you perform an asynchronous operation on a Socket, then yes, subsequent sync operations will actually be sync-over-async. But if you just do sync operations on a Socket, it's sync all the way down.

Ah, my mistake, I thought it all went via epoll as it has handling for dealing with blocked events by signalling them rather than queuing to ThreadPool https://github.com/dotnet/runtime/blob/35a59c1d8debff1e62e7142b7bbaa22996ff689d/src/libraries/System.Net.Sockets/src/System/Net/Sockets/SocketAsyncContext.Unix.cs#L270-L278

stephentoub commented 4 years ago

Yeah, to do async we flip the socket to non-blocking, and there are potential race conditions that would result if we tried to flip it back, so if an async operation is performed, all subsequent sync operations are then emulated.