dotnet / roslyn-analyzers

MIT License

Unsubstantiated claim in CA1835 (prefer the memory-based overloads of ReadAsync/WriteAsync) #7328

Open macrogreg opened 3 weeks ago

macrogreg commented 3 weeks ago

Analyzer

CA1835: Prefer the memory-based overloads of ReadAsync/WriteAsync methods in stream-based classes

https://learn.microsoft.com/en-us/dotnet/fundamentals/code-analysis/quality-rules/ca1835

Problem

The rule claims that the Memory-based Stream method overloads have a more efficient memory usage than the byte array-based ones.

The reason for this claim is not clear. In fact, it is not clear that the claim is true at all.

In fact, looking at the implementation of Stream.ReadAsync(Memory<byte> buffer, ...) I see that it delegates to the byte-array based implementation, plus does some (minor) additional stuff.

This suggests that the analyzer rule is either false, or that it is only applicable in certain scenarios, but it is not clear in which specific ones.
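For concreteness, here is a minimal sketch of the two call shapes the rule distinguishes. The class and method names (`Ca1835Sketch`, `ReadWithArrayAsync`, `ReadWithMemoryAsync`) are made up for illustration; only the `Stream.ReadAsync` overloads and `AsMemory` are real APIs.

```csharp
using System;
using System.IO;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical helper type, only here to put the two overloads side by side.
public static class Ca1835Sketch
{
    // The shape CA1835 flags: ReadAsync(byte[], int, int, CancellationToken),
    // which returns Task<int>.
    public static async Task<int> ReadWithArrayAsync(
        Stream stream, byte[] buffer, CancellationToken ct = default)
    {
        return await stream.ReadAsync(buffer, 0, buffer.Length, ct);
    }

    // The shape CA1835 suggests: wrap the same array in Memory<byte> and call
    // ReadAsync(Memory<byte>, CancellationToken), which returns ValueTask<int>.
    public static async Task<int> ReadWithMemoryAsync(
        Stream stream, byte[] buffer, CancellationToken ct = default)
    {
        return await stream.ReadAsync(buffer.AsMemory(0, buffer.Length), ct);
    }
}
```

Both methods fill the same `byte[]`. The visible difference at the call site is the return type (`Task<int>` vs. `ValueTask<int>`); whether the second form actually uses less memory presumably depends on what the concrete `Stream` type does in its override, which is exactly what the rule documentation does not explain.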

Solution options

The rule should be reviewed.

If the rule is false, it should be removed.

If the rule is true, the rule documentation should be improved to include an explanation of why the Memory-based APIs perform better. Ideally, a similar explanation should also be included in the API docs of the affected Stream APIs, and the docs should point to some performance measurements, or to how they can be performed/replicated.

If the rule is true only in specific circumstances, it should be clearly explained when it does and does not apply, and why. Then, the docs should include all of the above for the context of the specific circumstances in which the rule applies.

Workaround

Completely suppress the rule, as it currently only creates noise with no value. In .editorconfig, add:

[*.{cs,vb}]
dotnet_diagnostic.CA1835.severity = none

Relevant info

It seems that this has been raised in the past: https://github.com/dotnet/docs/issues/36438. However, that issue was not properly resolved.

gavinBurtonStoreFeeder commented 3 weeks ago

Thanks @macrogreg, this might be a better place, and you've written the question much better than I did.

If there are no current benchmarks, I would be happy to take care of that myself. I'll use BenchmarkDotNet to test various situations.

bzd3y commented 3 weeks ago

In fact, looking at the implementation of Stream.ReadAsync(Memory<byte> buffer, ...) I see that it delegates to the byte-array based implementation, plus does some (minor) additional stuff.

Isn't this reason alone to prefer the Memory-based implementation? It is handling the different scenarios for you (it doesn't just defer to the byte[] version, but has its own implementation with array pooling if the Memory<byte> isn't just a byte[]). I agree that it is good to understand what it is doing and perhaps the documentation could be improved there.

As far as I can tell, that is the point behind Memory<T> itself. It is just a wrapper for byte[] except for when it isn't. It provides more flexibility and allows for a standard or uniform interface for scenarios that previously might be different.

I think this method and the preference for it would just come out of that.
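Roughly, the behaviour I am describing looks like the sketch below. This is my own approximation, not the actual runtime source; `MemoryFallbackSketch` and `ReadViaArrayFallbackAsync` are hypothetical names, while `MemoryMarshal.TryGetArray` and `ArrayPool<byte>.Shared` are the real APIs I am assuming are involved.

```csharp
using System;
using System.Buffers;
using System.IO;
using System.Runtime.InteropServices;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical helper: approximates a Memory<byte>-based read built on top of
// the byte[]-based overload. Not the real Stream implementation.
public static class MemoryFallbackSketch
{
    public static async ValueTask<int> ReadViaArrayFallbackAsync(
        Stream stream, Memory<byte> destination, CancellationToken ct = default)
    {
        // Case 1: the Memory<byte> is just a view over a byte[];
        // hand that array straight to the byte[]-based overload.
        if (MemoryMarshal.TryGetArray<byte>(destination, out ArraySegment<byte> segment))
        {
            return await stream.ReadAsync(segment.Array, segment.Offset, segment.Count, ct);
        }

        // Case 2: the memory is not array-backed (for example, native memory);
        // read into a pooled array, then copy into the destination.
        byte[] rented = ArrayPool<byte>.Shared.Rent(destination.Length);
        try
        {
            int read = await stream.ReadAsync(rented, 0, destination.Length, ct);
            rented.AsMemory(0, read).CopyTo(destination);
            return read;
        }
        finally
        {
            ArrayPool<byte>.Shared.Return(rented);
        }
    }
}
```

The caller writes the same code in both cases, which is the uniform interface I mean. Note that in the array-backed case nothing is saved compared with calling the byte[] overload directly.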

gavinBurtonStoreFeeder commented 2 weeks ago

In fact, looking at the implementation of Stream.ReadAsync(Memory<byte> buffer, ...) I see that it delegates to the byte-array based implementation, plus does some (minor) additional stuff.

Isn't this reason alone to prefer the Memory-based implementation? It is handling the different scenarios for you (it doesn't just defer to the byte[] version, but has its own implementation with array pooling if the Memory<byte> isn't just a byte[]). I agree that it is good to understand what it is doing and perhaps the documentation could be improved there.

As far as I can tell, that is the point behind Memory<T> itself. It is just a wrapper for byte[] except for when it isn't. It provides more flexibility and allows for a standard or uniform interface for scenarios that previously might be different.

I think this method and the preference for it would just come out of that.

Unfortunately, that doesn't match up with what the rule says. It says that if one starts with a byte[], one should wrap it in a Memory<byte> and use those overloads because they use memory more efficiently.

It's clear that they do not. But again, when I filed the original issue I was giving the tool the benefit of the doubt and asking for proof.