Analyzer: Introduce an analyzer to recommend the more efficient string splitting features in .NET 8

geeknoid commented 1 year ago

Introduce a performance-centric analyzer that detects uses of String.Split and recommends the new more efficient string splitting functionality from MemoryExtensions. I'm not sure a fixer can be provided here, it might be too complicated for that. But having an analyzer to point out an obvious place that can get a perf boost is still useful.

Category : Performance Severity : Info

ghost commented 1 year ago

Tagging subscribers to this area: @dotnet/area-system-runtime See info in area-owners.md if you want to be subscribed.

Issue Details

Introduce a performance-centric analyzer that detects uses of String.Split and recommends the new more efficient string splitting functionality from MemoryExtensions. I'm not sure a fixer can be provided here, it might be too complicated for that. But having an analyzer to point out an obvious place that can get a perf boost is still useful.

Author:	geeknoid
Assignees:	-
Labels:	`area-System.Runtime`, `untriaged`, `code-analyzer`
Milestone:	-

buyaa-n commented 1 year ago

[Triage note]: We shouldn't flag for all string.Split cases, need to look for patterns where the new API could be used. Based on the pattern a fixer will be offered. When https://github.com/dotnet/runtime/issues/934 implemented we could replace more string.Split cases

Category : Performance Severity : Info

Question:

What pattern the analyzer should look for?
We might also want to flag String.Trim, String.TrimEnt, String.TrimStart and suggest new Trim, TrimEnd, TrimStart from MemoryExtensions. Maybe with different ID as these could have code-fixers.

bartonjs commented 1 year ago

Video

Out of a concern of false positives, it seems like the initial version of this analyzer should be limited to those callsites where a count was passed into string.Split, as that pattern has a perfect analogue in the span-based split.

Using uncounted string.Split in a foreach would be better handled by an iterator-based split (which we don't have yet). Other uses of uncounted split might be analyzable, but they're harder to describe, so they should probably be a different diagnostic ID (once the pattern can be analyzed) just to keep the docs/explanation sane.

The suggestion regarding Trim/TrimEnd/TrimStart may have merit, but seems like it might need some more thought (as it depends heavily on what happens after the call to Trim)... and should be addressed in a separate issue, with more details/examples.

Category: Performance Severity: Info

dotnet / runtime

Analyzer: Introduce an analyzer to recommend the more efficient string splitting features in .NET 8 #85487