CommunityToolkit / dotnet

.NET Community Toolkit is a collection of helpers and APIs that work for all .NET developers and are agnostic of any specific UI platform. The toolkit is maintained and published by Microsoft, and part of the .NET Foundation.
https://docs.microsoft.com/dotnet/communitytoolkit/?WT.mc_id=dotnet-0000-bramin

Support tokenize span using more than 1 separator #807

Open skarllot opened 11 months ago

skarllot commented 11 months ago

Overview

Sometimes a Span<T> or ReadOnlySpan<T> needs to be tokenized using more than one separator, but the existing Tokenize extension accepts only a single separator.

API breakdown

namespace CommunityToolkit.HighPerformance;

public static class SpanExtensions
{
    public static SpanTokenizer2<T> Tokenize<T>(this Span<T> span, T separator0, T separator1);
    public static SpanTokenizer3<T> Tokenize<T>(this Span<T> span, T separator0, T separator1, T separator2);
    public static SpanTokenizerAny<T> Tokenize<T>(this Span<T> span, ReadOnlySpan<T> separators);
}

namespace CommunityToolkit.HighPerformance;

public static class ReadOnlySpanExtensions
{
    public static ReadOnlySpanTokenizer2<T> Tokenize<T>(this ReadOnlySpan<T> span, T separator0, T separator1);
    public static ReadOnlySpanTokenizer3<T> Tokenize<T>(this ReadOnlySpan<T> span, T separator0, T separator1, T separator2);
    public static ReadOnlySpanTokenizerAny<T> Tokenize<T>(this ReadOnlySpan<T> span, ReadOnlySpan<T> separators);
}

Usage example

ReadOnlySpan<char> content = "John; 1960, USA";

foreach (var token in content.Tokenize(';', ','))
{
    Console.WriteLine(token.ToString());
}

Breaking change?

No

Alternatives

Not that I'm aware of.

Additional context

The new tokenizer structs can call IndexOfAny instead of IndexOf to locate the next separator.
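
To illustrate the IndexOfAny idea, here is a minimal sketch of what a two-separator tokenizer could look like. It is not the toolkit's implementation: the type name TwoSeparatorTokenizer<T> is hypothetical and chosen only for this example, and the handling of empty tokens is an assumption. The only library call it relies on is MemoryExtensions.IndexOfAny(ReadOnlySpan<T>, T, T).

```csharp
using System;

ReadOnlySpan<char> content = "John; 1960, USA";

// Pattern-based foreach: the struct exposes GetEnumerator, MoveNext and Current.
foreach (ReadOnlySpan<char> token in new TwoSeparatorTokenizer<char>(content, ';', ','))
{
    Console.WriteLine(token.ToString());
}

// Hypothetical type for illustration only; not part of CommunityToolkit.HighPerformance.
public ref struct TwoSeparatorTokenizer<T>
    where T : IEquatable<T>
{
    private ReadOnlySpan<T> remaining;
    private ReadOnlySpan<T> current;
    private readonly T separator0;
    private readonly T separator1;
    private bool done;

    public TwoSeparatorTokenizer(ReadOnlySpan<T> span, T separator0, T separator1)
    {
        remaining = span;
        current = default;
        this.separator0 = separator0;
        this.separator1 = separator1;
        done = false;
    }

    public TwoSeparatorTokenizer<T> GetEnumerator() => this;

    public ReadOnlySpan<T> Current => current;

    public bool MoveNext()
    {
        if (done)
        {
            return false;
        }

        // One scan finds whichever separator comes first.
        int index = remaining.IndexOfAny(separator0, separator1);

        if (index < 0)
        {
            // No separator left: the rest of the span is the final token.
            current = remaining;
            remaining = default;
            done = true;
            return true;
        }

        current = remaining.Slice(0, index);
        remaining = remaining.Slice(index + 1);
        return true;
    }
}
```

With the input from the usage example, this sketch yields three tokens: "John", " 1960" and " USA". Whitespace is preserved, so callers would trim each token themselves if needed. A three-separator or span-of-separators variant could follow the same shape using the corresponding IndexOfAny overloads.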

Help us help you

Yes, I'd like to be assigned to work on this item

skarllot commented 7 months ago

Can this feature be implemented?