fsprojects / FSharp.Data

F# Data: Library for Data Access
https://fsprojects.github.io/FSharp.Data
Other
816 stars 288 forks source link

Use System.Net.Http.HttpClient #1392

Open Tarmil opened 3 years ago

Tarmil commented 3 years ago

This was already proposed in #93, but given how much the landscape has changed since 2013, I thought it better to create a new issue.

The specific reason why I'm proposing it right now is to be able to use FSharp.Data in WebAssembly with Bolero; see this issue. But more generally I think it would be good to update to the modern and recommended library.

cartermp commented 3 years ago

Agreed! Would gladly accept a pull request here.

AlexGagne commented 3 years ago

Hey, I'm new here but I figured I'd give a go at this. I encountered an issue, but I'm not sure how to continue.

HttpClient in .netstandard 2.0 has issues that were fixed in later versions. Basically, you have two problems:

This was fixed in .Net Core 2.1 and later using SocketsHttpHandler. The solution is then to use a single HttpClient using SocketsHttpHandler, which avoids the stale DNS issue. To use this in FSharp.Data, the framework of this library would have to be upgraded and it will no longer support .net framework (since .netstandard 2.0 is the latest version that supports .net framework).

I've come to the following solutions and I'm not sure which I should follow:

  1. Upgrade .Net and stop supporting .netframework
  2. Choose either "bad" solution. Maybe add a way for the user of the library to specify which "problem" they want to handle?
  3. Add some kind of solution that periodically refreshes the HttpClient to avoid the stale DNS issue.
  4. I've found backports of SocketsHttpHandler to .netstandard 2.0. They are available in nuget packages, but I'm wary of adding dependencies to the project.
  5. Write a backport for .netstandards 2.0. The implementation looks fairly complex and is probably error-prone.

I'm willing to do 1-4, but I'm less sure about how to implement 3. 5 looks a little too complex for my liking. I'm open for other solutions as well, if you can find others.

I think it would be a good idea to have a maintainer pitch in.

aaronmu commented 3 years ago

I don't want to add an IWhatever using the AddHttpClient extension method on IServiceCollection. I want to send a flippin' web request without shooting myself in the foot. The current version of FSharp.Data is awesome at that. I really hope we don't lose the ability to send a web request in a single line of code!

cartermp commented 3 years ago

@aaronmu this is about replacing internals of FSharp.Data, not what you interface with.

AlexGagne commented 3 years ago

@cartermp Do you have an opinion on my comment? I would like to work on adding the HttpClient, but it requires a change that could be breaking and I'd like to know a maintainer's opinion on the subject

cartermp commented 3 years ago

This was fixed in .Net Core 2.1 and later using SocketsHttpHandler

This makes me scream "UGH" into a pillow.

I don't think we're at the point where we can drop support for .NET Standard 2.0 in this package. I wish it were so, because then this would be much simpler.

Since I don't work at Microsoft anymore I can't get stats on relative usage of .NET FX with F# vs. .NET. I know that with .NET 5 there were more F# developers targeting modern .NET than .NET Framework, and so maybe that continues enough to where we can cut another major version and break compat.

@dsyme what do you think? And could you possibly look into usage from Visual Studio to see?

dsyme commented 3 years ago

Could we #if NETSTANDARD21 the use of HttpClient and have both netstandard2.0 and netstandard2.1 tfms?

cartermp commented 3 years ago

We could, but that would lead to either:

I'm really not happy with either approach, but I'd pick the latter if it came down to it.

ordinaryorange commented 2 years ago

I just had a deep dive into this, naively thinking this might no be too hard to do if we were happy with multiple TFM's @AlexGagne your deductions on the .net standard 2.0 issues agreed with my research. @cartermp keep that pillow close...the pain keeps coming.

So I thought I'd have a crack at a possible implementation (as I just hit needing FSharp.Data in blazor too) @Tarmil I had a look at your attempt years back for inspiration, but it appears to be a short term fix for your needs at the time ?

Now I'm a fair noob with HttpClient not having used it much & relying in FSharp.Data for all my http work to date, so I could be misguided in my summations here.

TLDR - IMO, To make proper use of httpClient (say in terms of socket reuse) the public API of FSharp.Data.Http really needs a restructure. It is somewhat possible to integrate HttpClient without changing much of the public API but we will have to give up some of the goodness HttpClient offers, and make a bunch of assumptions for the caller, which just feels awful.

To avoid tuning this into an essay, here are some dot points to chew on

Some of ideas so far -

If .net standard 2.0 is still targeted, I don't think it would make FSharp.Data.Http more buggy. Agreed a caller on two different platforms would get a different result, but this is not really FSharp.Data.Http's problem. It is a runtime problem. I use many packages that have idiosyncrasy between platforms and if the reasons why are called out in the docs, I'm fair warned and more accepting.

I'm happy to participate should anyone else be willing wish to chew this off, as It would definitely be something I would use.

AlexGagne commented 2 years ago

In your suggestion, would using the Factory part be optional? My main use-case is simple calls with little configuration (I'm not sure if this is the main use case of people who use FSharp.Data.Http), so if there was a default HttpClient used in the simple cases, when HttpClient is not specified, that would work for me.

I'm a little bit wary for existing users who can only support .netstandard 2.0. It's not just that the behaviour is different between TFMs, but it's that the next version introduces a possible bug (stale DNS) in existing code unless the user makes changes. Even then, there is no correct solution the user can choose. They can use an HttpClient per request, leading to a possible port exhaustion, or reuse a single HttpClient and possibly lead into a stale DNS. If I was stuck on .netstandard 2.0, I would probably just not upgrade since the current version does not run into these issues.

ordinaryorange commented 2 years ago

In your suggestion, would using the Factory part be optional?

I don't see why not, HttpClient has a parameterless ctor. Have not thought through any implications, but I agree any changes need to make sure use of FSharp.Data.Http remain pain free for the caller

but it's that the next version introduces a possible bug (stale DNS) in existing code unless the user makes changes

If my assertions are correct, and that the FSharp.Data.Http API needs to change, then said users would have to make changes to their code base as the API surface will have changed.

Just researching the DNS matter a bit more. This article suggests stale DNS could be a problem even on platforms that use SocketsHttpHandler under the hood by default. As active connections are never closed. The caller has to explicitly deal with it.

According to the docs on HttpClient HttpWebRequest is used under the hood for .NET Framework, but I cant find any information on how DNS is handled here. I wonder if it suffers the same DNS issues already making the DNS discussion moot ?

kjreills commented 2 years ago

I ran some tests on the current implementation to see where this is at, and it appears socket exhaustion is already a potential problem with the current HttpWebRequest based implementation.

Based on that original HttpClient article I ran a simple program using the existing FSharp.Data.Http utility, ran netstat.exe, and lo and behold, I had a bunch of TIME_WAIT connections hanging out there:

The program:

open FSharp.Data

for i in [ 1..100 ] do
    Http.Request("http://something.local")  |> ignore

Netstat result:

  TCP    192.168.1.238:62635    something.local:9090           TIME_WAIT
  TCP    192.168.1.238:62636    something.local:9090           TIME_WAIT
  TCP    192.168.1.238:62637    something.local:9090           TIME_WAIT
  TCP    192.168.1.238:62638    something.local:9090           TIME_WAIT
  TCP    192.168.1.238:62639    something.local:9090           TIME_WAIT
  ... (truncated for brevity, but all ports in this range were used as well)
  TCP    192.168.1.238:62733    something.local:9090           TIME_WAIT
  TCP    192.168.1.238:62734    something.local:9090           TIME_WAIT

So, even if the library wrapped HttpClient and instantiated and disposed of it every time, it appears that the developer experience would not be degraded compared to the current implementation. But that's the bare minimum approach, and not likely the preferred solution. Instead, based on the Recommended Use section of the "HttpClient Guidelines", it appears the library has to consider the underlying Target Framework to some degree in order to follow best practice, that being:

Based on that, I would advocate for reshaping this library around the assumption that the relying developers are the best people to manage the HttpClient instances, since their use cases are many and varied, and those will greatly affect their approach to HttpClient usage. Some outcomes of this might be:

Anyway, those are just my thoughts, I would love to hear others' opinions on all this!

dsyme commented 2 years ago

Hmmmm interesting. Would be good to hear what other people think. It feels like we should at least make sure that we're following the recommended use (we should assume .NET Core 2.1+)

So adding the extra overloads sounds about right