dotnet / runtime

.NET is a cross-platform runtime for cloud, mobile, desktop, and IoT apps.
https://docs.microsoft.com/dotnet/core/
MIT License
15k stars 4.67k forks source link

Consider moving System.IO.Pipelines into the shared framework #28760

Closed ahsonkhan closed 4 months ago

ahsonkhan commented 5 years ago

Motivation: The recent in-box JSON serializer APIs (see https://github.com/dotnet/corefx/issues/34372 / https://github.com/dotnet/apireviews/tree/master/2019/System.Text.Json-Serialization) expect to take PipeReader/PipeWriter. This results in System.Text.Json depending on System.IO.Pipelines. However, the Pipelines package is not part of the shared framework and hence cannot be referenced by S.T.Json.

See discussion here: https://github.com/dotnet/apireviews/pull/91#discussion_r256595035

Can we move parts of System.IO.Pipelines (or all of it) in-box?

If we consider Pipeline APIs "siblings" to streams, then they could be pushed down low enough within the stack for S.T.Json to take a dependency on it (but maybe not low enough to be pushed to corelib). Is that an appropriate view of pipelines? What are the risks/constraints with moving System.IO.Pipelines inbox?

Possible alternatives to resolve the motivation differently: 1) The JSON serializer instead accepts some other abstraction that is implemented within System.Memory (an assembly it already depends on).

cc @davidfowl, @steveharter, @pakrym, @joshfree, @terrajobst, @jkotas, @stephentoub, @bartonjs, @ericstj, @KrzysztofCwalina

jkotas commented 5 years ago

on the writer/serializer side, an extension of the IBufferWriter interface, IAsyncBufferWriter

This seems to be the direction we started heading towards when we have introduced IBufferWriter interface. Do we understand what would the design of these interfaces look like?

The JSON serializer doesn't provide direct PipeReader/PipeWriter support at all (less than ideal).

Do you have a number for how much performance is going to be left on the table if we do nothing? (e.g. in TE benchmarks)

davidfowl commented 5 years ago

I don’t understand the question Jan. ASP.NET Core is on the path to pipelines (we’re exposing it in our public API and implanting it natively in kestrel and all of the servers).

The left on the table question IMO is irrelevant. We need to add pipelines support to the serializer for all ASP.NET Core developers, that’s the path we’re currntly on and this is a step in that direction.

That’s said, we can totally introduce new abstractions that we think are more core than pipelines and stick them in corlib or (System.Buffers) as long as it looks similar 😬. At that point why not just push pipelines all the way down.

davidfowl commented 5 years ago

There’s a 5th option to add to the list that I also hate but for the sake of completeness:

jkotas commented 5 years ago

The left on the table question IMO is irrelevant.

The write up says "less than ideal" without additional explanation. I have asked regular manager-type question to quantify it.

ASP.NET Core is on the path to pipeline

That's fine. Pipelines are fine tuned for webserver workload. It makes sense to use them in ASP.NET plumbing.

just push pipelines all the way down.

The pipeline looks very complex to me (e.g. compared to Stream). Maybe it needs to be as complex. I do not know. It is the reason for the second question.

ahsonkhan commented 5 years ago

Do you have a number for how much performance is going to be left on the table if we do nothing? (e.g. in TE benchmarks)

I don't have perf numbers for not directly supporting PipeReader/PipeWriter, especially since the serializer isn't fully baked yet/utilized within aspnet. It would depend on what the usage would look like to support pipelines with the other API overloads. @steveharter, could you help with getting some preliminary measurements?

The write up says "less than ideal" without additional explanation. I have asked regular manager-type question to quantify it.

Other than potential perf loss (which I agree needs to be quantified), that comment comes mainly from a usability perspective on behalf of up-stack usages within the aspnetcore repos which becomes more difficult as more of it depends on pipelines (i.e. all usages would have to work around this limitation). Any library developer depending on pipelines would have this usability concern as well.

davidfowl commented 5 years ago

That's fine. Pipelines are fine tuned for webserver workload. It makes sense to use them in ASP.NET plumbing.

They're not, they are just an abstraction like Streams. Usable for doing any IO. Client or server or anything else really (that's why the stream adapters are so easy to write).

I don't have perf numbers for not directly supporting PipeReader/PipeWriter, especially since the serializer isn't fully baked yet/utilized within aspnet. It would depend on what the usage would look like to support pipelines with the other API overloads. @steveharter, could you help with getting some preliminary measurements?

While I think we should have these, this should not be part of this discussion, so lets not discuss performance here.

The pipeline looks very complex to me (e.g. compared to Stream). Maybe it needs to be as complex. I do not know. It is the reason for the second question.

I'll leave this code here:

Stream code:

https://github.com/dotnet/corefxlab/blob/e075d78df60452b68d212e3333fd3f37cd28d4f0/src/System.Text.JsonLab.Serialization/System/Text/Json/Serialization/JsonSerializer.Read.Stream.cs#L58-L115

Pipe code:

https://github.com/dotnet/corefxlab/blob/e075d78df60452b68d212e3333fd3f37cd28d4f0/src/System.Text.JsonLab.Serialization/System/Text/Json/Serialization/JsonSerializer.Read.Pipe.cs#L53-L70

"Complex" is in the eye of the beholder 😄

jkotas commented 5 years ago

lets not discuss performance here.

If the performance is not a factor, then we can pass Stream everywhere and be done with it. No?

My understanding was that the reason for doing this is performance. Pipelines have a lot of overlap with streams. The added value of pipelines over System.IO.Stream is cooperative buffer management protocol that allows you to avoid a memory copy and throttle amount of buffering. Both these are performance optimizations.

jkotas commented 5 years ago

I'll leave this code here:

IMHO, the difference has more to do with the fact that this code was written for pipelines first and stream second. If it was done the other way around, the pipelines code would be more complex and the streams code would be simpler.

davidfowl commented 5 years ago

We already have streams everywhere. Performance is a factor but I don’t want this discussion to turn into streams vs pipelines.

Instead I’d like to discuss and agree on a couple of things:

I’m ok going in the direction of thin abstractions being pulled lower into the stack as long as we don’t lose the benefits of the abstractions.

jkotas commented 5 years ago

Fair enough. So this issue should rather be called: "Decide on guidelines for using pipelines abstractions in netcoreapp".

davidfowl commented 4 years ago

Spoke to @stephentoub and he's OK with this being part of .NET Core App. There's nothing confirmed about whether we actually use it anywhere yet though.

davidfowl commented 3 years ago

Anything preventing us from doing this in 6.0? It ended up getting punted in 5.0.0 because we backed out System.Net.Connections but it isn't related to that effort. Could we get this in for 6.x?

stephentoub commented 3 years ago

In general I want to avoid adding assemblies to netcoreapp unless there's an actual need to do so. What drives that need here at this point?

davidfowl commented 3 years ago

It's already in the ASP.NET Core shared framework and it's generally useful outside of that. If the bar is that it has to be used by other code in Microsoft.NetCore.App (and I can find counter examples...), but I understand. We don't really have any hard and fast rules for this.