Should we consider adding these APIs on SequenceReader instead (or potentially in addition to)?
Something like these extension methods: https://github.com/dotnet/corefxlab/blob/94248fb2ea9881936d6338b0d2d62ca8c79cbd79/src/System.Buffers.ReaderWriter/System/Buffers/Reader/BufferReader_text.cs#L239-L275
The bytesConsumed parameter doesn't make too much sense for a SequencePosition. How about consumedPosition (or sequenceConsumed, consumedUntil, etc.)?
cc @JeremyKuhne
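For illustration, a sequence-based overload along those lines might be shaped like this (purely a hypothetical signature, not an approved API):

public static bool TryParse(ReadOnlySequence<byte> source, out int value, out SequencePosition consumedPosition, char standardFormat = default);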
I originally had that, but the problem is that text-based protocols are usually delimited. For example, say you wanted to parse an HTTP/1.1 response (e.g. HTTP/1.1 200 OK\r\n). You want to write code that looks like this:
// HTTP/1.1
if (!sequenceReader.TryReadTo(out ReadOnlySpan<byte> version, (byte)' '))
{
    return;
}

// 200
if (!sequenceReader.TryReadTo(out ReadOnlySpan<byte> statusCodeText, (byte)' '))
{
    return;
}
else if (!Utf8Parser.TryParse(statusCodeText, out int statusCode, out _))
{
    return;
}

// OK
if (!sequenceReader.TryReadTo(out ReadOnlySequence<byte> statusText, NewLine))
{
    return;
}
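(NewLine isn't shown in these snippets; presumably it's a CRLF delimiter along these lines:)

private static ReadOnlySpan<byte> NewLine => new byte[] { (byte)'\r', (byte)'\n' };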
If we add these parser APIs, how do you avoid parsing the status code into an integer until you have the full payload?
// HTTP/1.1
if (!sequenceReader.TryReadTo(out ReadOnlySpan<byte> version, (byte)' '))
{
    return;
}

// 200
// This code is incorrect because it'll parse 2, 20, and 200 successfully. So we need
// to scan for a space before we start parsing to make sure we have the full payload.
if (!sequenceReader.TryParse(out int statusCode))
{
    return;
}

// OK
if (!sequenceReader.TryReadTo(out ReadOnlySequence<byte> statusText, NewLine))
{
    return;
}
(Standard grumbling that this belongs on an AsciiParser class, not a Utf8Parser class, since presumably you want these APIs to fail if they encounter non-ASCII data.)
The APIs seem reasonable. We'd have to be a little careful with the implementation because each method would need to know the worst-case input length based on the out T and the char standardFormat parameters. It's likely that any introduced APIs will have bugs in edge case handling, especially for types like double where the input buffer could be arbitrarily long. That doesn't mean we shouldn't do them, but it does potentially mean we should consider carefully which overloads we want to ship with.
I'd love to find a more general way to tackle this. Adding these convenience overloads for everything parsing a Span will get tedious, and it would be tacit encouragement to parse ReadOnlySequence directly.
I don't know if I'd want it in corefx, but an extension on ReadOnlySequence could make this simple:
// return FirstSpan if single element, otherwise copy to storage and return storage.
ReadOnlySpan<byte> ReadOnlySequence.MaybeCopyTo(Span<byte> storage);
Utf8Parser.TryParse(byteSequence.MaybeCopyTo(stackalloc byte[128]), out int statusCode, out int bytesConsumed);
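For what it's worth, a minimal sketch of that helper as an extension method (MaybeCopyTo is the name suggested above; this sketch assumes storage is at least as large as the sequence in the multi-segment case):

using System;
using System.Buffers;

static class ReadOnlySequenceExtensions
{
    // Return the first span directly when the sequence is a single segment;
    // otherwise copy into the caller-provided storage and return that slice.
    public static ReadOnlySpan<byte> MaybeCopyTo(this in ReadOnlySequence<byte> sequence, Span<byte> storage)
    {
        if (sequence.IsSingleSegment)
        {
            return sequence.FirstSpan;
        }

        sequence.CopyTo(storage); // throws if storage is smaller than the sequence
        return storage.Slice(0, (int)sequence.Length);
    }
}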
I'd love to find a more general way to tackle this. Adding these convenience overloads for everything parsing a Span will get tedious, and it would be tacit encouragement to parse ReadOnlySequence directly.
Case by case seems fine, but if we're looking at language features then I'd rather not block on that to add these very simple overloads that I keep re-writing :smile:
The APIs seem reasonable. We'd have to be a little careful with the implementation because each method would need to know the worst-case input length based on the out T and the char standardFormat parameters. It's likely that any introduced APIs will have bugs in edge case handling, especially for types like double where the input buffer could be arbitrarily long. That doesn't mean we shouldn't do them, but it does potentially mean we should consider carefully which overloads we want to ship with.
We're just delegating to the Span overloads for most of these (I would assume all check for overflow). Isn't double the only special case? Doesn't it already have to be handled by the span overload?
Isn't double the only special case? Doesn't it already have to be handled by the span overload?
Not sure if it's the only special case. And the span overloads can assume that they have all the data, which means that edge case handling is simpler. As an example of something that might be error-prone: what happens if the input is a 100GB stream consisting of all zeroes and you pass such a stream to the double parser? We can't delegate to the span parser because it can't handle payloads that large.
Again, not an argument that we shouldn't do it, but more of an admission that the more we try to support the higher the odds of this being a bug farm. So we'd want to weigh carefully the tradeoffs of having an API that works directly with sequences vs. telling customers to find the delimiters themselves, slice, and use the span-based APIs.
Not sure if it's the only special case. And the span overloads can assume that they have all the data, which means that edge case handling is simpler.
Why can't these overloads assume the same thing?
As an example of something that might be error-prone: what happens if the input is a 100GB stream consisting of all zeroes and you pass such a stream to the double parser? We can't delegate to the span parser because it can't handle payloads that large.
What happens if you do that for the span overload?
What happens if you do that for the span overload?
If you pass int.MaxValue worth of zeroes to the span overload? It should handle them just fine and return 0.0.
Constrain the input size to int.MaxValue for ReadOnlySequence
For example, say you wanted to parse an HTTP/1.1 response (e.g. HTTP/1.1 200 OK\r\n). You want to write code that looks like this:
Is this example how you imagine writing a production-quality, high-performance HTTP response parser? E.g. are you going to use these APIs in Kestrel if we add them?
That would be ideal if the performance drop wasn't terrible. We started to write an HTTP client for performance testing here https://github.com/sebastienros/httpsocket/blob/38f3ee6d4a357fa60fec4a8b6cf802ce7e921924/src/HttpSocket.cs#L146-L175 and it uses the above code.
So what is this code fragment going to look like with this API?
I would use the ReadOnlySequence overloads instead, where I get the data as a ReadOnlySequence instead of a ReadOnlySpan.
This isn't the main scenario though. It's when you have parsed some sequence of data, then need to further parse and slice to get the right section, then want to turn it into some primitive.
And would you still use SequenceReader? I think we have way too many of these parsing helper APIs and it is getting very difficult to figure out what is the right combination to use for any given situation.
And would you still use SequenceReader? I think we have way too many of these parsing helper APIs and it is getting very difficult to figure out what is the right combination to use for any given situation.
I don't think it's complicated (having written a bunch of these parsers now). The main problem is that nothing takes a ReadOnlySequence, so once you've parsed and narrowed your search, you're left with the complicated logic to materialize the bytes into something else. It's very much like trying to use Span<T> on .NET Standard 2.0.
(we just added these, thanks @GrabYourPitchforks!)
This is the perfect example, parsing a chunked prefix in HTTP/1.1. The payload looks like this:
a\r\n
Helloworld\r\n
0\r\n
\r\n
I want to parse the hex number (a for example) after grabbing all text up to the newline.
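For illustration, here is roughly what that looks like today: slice to the CRLF first (so a partial number is never parsed), then hex-parse with the 'x' standard format. TryReadChunkSize, NewLine, and the 16-byte cap are assumptions of this sketch:

using System;
using System.Buffers;
using System.Buffers.Text;

static class ChunkedParser
{
    static ReadOnlySpan<byte> NewLine => new byte[] { (byte)'\r', (byte)'\n' };

    public static bool TryReadChunkSize(ref SequenceReader<byte> reader, out int chunkSize)
    {
        chunkSize = 0;

        // Grab everything up to the newline before parsing anything.
        if (!reader.TryReadTo(out ReadOnlySequence<byte> sizeText, NewLine) || sizeText.Length > 16)
        {
            return false;
        }

        Span<byte> storage = stackalloc byte[16];
        ReadOnlySpan<byte> span;
        if (sizeText.IsSingleSegment)
        {
            span = sizeText.FirstSpan;
        }
        else
        {
            sizeText.CopyTo(storage);
            span = storage.Slice(0, (int)sizeText.Length);
        }

        // 'x' parses hexadecimal, so "a" yields 10.
        return Utf8Parser.TryParse(span, out chunkSize, out _, standardFormat: 'x');
    }
}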
So the code pattern is that the caller slices the input based on some delimiter (such as CRLF), then passes the sliced sequence to these parsing APIs? That is somewhat different than the behavior of the span-based APIs, which consume input until a delimiter is encountered.
So the code pattern is that the caller slices the input based on some delimiter (such as CRLF), then passes the sliced sequence to these parsing APIs?
In these particular examples that are handling streaming data it has to be done this way. Parsing partial integers would succeed so we need to get the entire "frame" first.
That is somewhat different than the behavior of the span-based APIs, which consume input until a delimiter is encountered.
Yes, I guess that's why the bytesConsumed parameter exists. Generally, though, the same approach applies here when using these APIs in the same way. I now see your original concern about overflow and edge cases, but that can be mitigated by having a default limit on the input (even though it's a ReadOnlySequence).
So what we're really talking about is a behavioral equivalent to int.TryParse, not Utf8Parser.TryParse(ROS<byte>, out int)?
@GrabYourPitchforks I think that would be reasonable, yes. That way we can safely delegate to the Span overloads. I'd prefer if we kept these implementations super cheap to implement. So far that's how we've scoped it.
Do we also need to add these overloads for the ReadOnlySpan<byte> variant (maybe it could be a different request)? I've updated the issue to reflect this.
Just to set baseline expectations, these are not going to be cheap to implement if you're looking for behavioral parity with existing TryParse methods. Any length restriction we impose would be artificial (and honestly could be done by the caller without us writing a helper method), so the likely result is that we'll end up duplicating the existing parsing logic into these methods.
Just to set baseline expectations, these are not going to be cheap to implement if you're looking for behavioral parity with existing TryParse methods. Any length restriction we impose would be artificial (and honestly could be done by the caller without us writing a helper method), so the likely result is that we'll end up duplicating the existing parsing logic into these methods.
OK, well that's unfortunate, but I don't really have any pushback. The only unfortunate thing here basically leads back to what @scalablecory mentioned: there's no way to have a byte enumerator abstraction that would be more efficient over a Span and less efficient (but still work) over a ReadOnlySequence/SequenceReader. That might be worth looking into if we end up needing to duplicate all of these methods.
@AndyAyersMS @jaredpar @jkotas Here's a question for you:
bool TryParseInt(ByteEnumerator enumerator, out int value)
{
    int result = 0;
    while (enumerator.TryGetNext(out byte b))
    {
        var digit = b - '0';
        result = result * 10 + digit;
    }
    value = result;
    return true;
}
We need a language/JIT/compiler feature that lets me write code like the above but turns it into the equivalent of:
bool TryParseInt(Span<byte> span, out int value)
{
    int result = 0;
    for (var i = 0; i < span.Length; i++)
    {
        var digit = span[i] - '0';
        result = result * 10 + digit;
    }
    value = result;
    return true;
}
That is, when the ByteEnumerator is a Span<byte>, it should compile down to the equivalent of the above.
Agree - having something like the ref interface proposal would definitely help here. Then we could potentially reuse the same logic for Span and ReadOnlySequence and rely on generic specialization to make things more efficient.
@davidfowl @GrabYourPitchforks C++ enables you to do this with templates and iterators, and I've employed it very successfully in the past for high-perf parsing. I'd love it if we could get better parity there.
When I tried to do this myself in C# I hit two roadblock features I needed:

1. Allowing a ref struct to implement interfaces.
2. A ref struct generic constraint that enforces "you must use this generic as if it were a ref struct".

Then you can do e.g.:

ref struct SpanEnumerator : IByteEnumerator
{
    // ...
}

struct SequenceEnumerator : IByteEnumerator
{
    // ...
}

bool TryParseHelper<ByteEnumerator>(ByteEnumerator enumerator, out int value)
    where ByteEnumerator : IByteEnumerator, ref struct
{
    // ...
}

bool TryParse(ReadOnlySequence<byte> sequence, out int value)
{
    return TryParseHelper(CreateSpanEnumerator(sequence.FirstSpan), out value)
        || TryParseHelper(CreateSequenceEnumerator(sequence), out value);
}
I briefly touched on this in https://github.com/dotnet/csharplang/issues/1479#issuecomment-463488171
C++ also makes it really easy via traits/ADL/SFINAE/etc. to automatically select e.g. the most efficient IndexOf for an iterator type -- I'd love that for this as well, but that is a significantly more complex feature to ask for :)
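For what it's worth, the sequence half of this is expressible today, because ReadOnlySequence<byte>.Enumerator and ReadOnlyMemory<byte> are plain structs; it's the Span-backed enumerator that needs the missing ref struct features. A rough sketch (IByteEnumerator, SequenceEnumerator, and TryParseInt are hypothetical names):

using System;
using System.Buffers;

interface IByteEnumerator
{
    bool TryGetNext(out byte value);
}

struct SequenceEnumerator : IByteEnumerator
{
    private ReadOnlySequence<byte>.Enumerator _segments;
    private ReadOnlyMemory<byte> _current;
    private int _index;

    public SequenceEnumerator(ReadOnlySequence<byte> sequence)
    {
        _segments = sequence.GetEnumerator();
        _current = default;
        _index = 0;
    }

    public bool TryGetNext(out byte value)
    {
        // Move to the next non-empty segment when the current one is exhausted.
        while (_index >= _current.Length)
        {
            if (!_segments.MoveNext())
            {
                value = 0;
                return false;
            }
            _current = _segments.Current;
            _index = 0;
        }

        value = _current.Span[_index++];
        return true;
    }
}

static class Parsers
{
    // The struct constraint lets the JIT specialize per enumerator type,
    // so the interface calls devirtualize and nothing is boxed.
    public static bool TryParseInt<TEnumerator>(ref TEnumerator enumerator, out int value)
        where TEnumerator : struct, IByteEnumerator
    {
        int result = 0;
        bool anyDigits = false;
        while (enumerator.TryGetNext(out byte b))
        {
            int digit = b - '0';
            if ((uint)digit > 9)
            {
                break;
            }
            result = result * 10 + digit;
            anyDigits = true;
        }

        value = result;
        return anyDigits;
    }
}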
I agree ref interfaces or ref constraints would be useful.
I have doubts whether the specific helper methods discussed here are useful enough to be in the core framework. I believe that there are two mainstream categories of parsers to worry about: high-performance production-grade parsers, where helper methods like these are not useful; and lower-performance handrolled parsers that just need to get the job done.
The methods proposed here do not seem to fit either category.
@jkotas I agree that ReadOnlySequence<T> isn't performant for parsing. Its job was to be a representation of a single- or multi-segment buffer that would be as efficient as possible while in a single buffer, but pay the cost when you needed to span buffers. While that came with some tradeoffs, we have this type in the BCL now and I'd like more ways to interact with it in the BCL natively, without having to write helpers to do the most basic parsing.
Helper methods like these are not useful for high-performance production-grade parsers.
That's subjective. These helpers could be very fast in the case of parsing a single segment ROS and slower when we need to cross buffers. Once you've decided to represent a buffer with this type you're already in that world so I don't think this argument holds true.
No matter how you feel about that, users have to write the above code today and I want to remove that need in the next version.
Any length restriction we impose would be artificial (and honestly could be done by the caller without us writing a helper method)
@GrabYourPitchforks prior art: https://github.com/dotnet/corefxlab/blob/94248fb2ea9881936d6338b0d2d62ca8c79cbd79/src/System.Buffers.ReaderWriter/System/Buffers/Reader/BufferReader_text.cs#L15
We'll see if anybody complains :smile:
EDIT: Wait, that was corefxlab :smile:
That's subjective. These helpers could be very fast
It is not just about performance. It is also about usability. What would help to convince me is if you showed me a delta of how these methods would be used in Utf8JsonReader, Kestrel (or some other production-grade parser). The motivating example was from test code and it was not very convincing.
No matter how you feel about that, users have to write the above code today and I want to remove that need in the next version.
We should focus on designing for the scenario: parsing data coming from a ReadOnlySequence. We have SequenceReader for that today. I find it confusing that we are not improving SequenceReader and instead are trying to add an alternative way to parse ReadOnlySequence that is not SequenceReader. How are the users of the platform supposed to pick the right way to parse the ReadOnlySequence?
Compare this with the simple story for TextReader that does not have ambiguity like this.
It is not just about performance. It is also about usability.
The general pattern for using SequenceReader is to use its methods to narrow down a Span or Sequence of content to further parse. We have many examples of production parsers already described in this thread.
Kestrel's header parser https://github.com/aspnet/AspNetCore/blob/dba39d60c89b743d45411ba6bc6b02b5a8019032/src/Servers/Kestrel/Core/src/Internal/Http/HttpParser.cs#L180 (ignore the pointers, those will go away soon :smile:)
The Form Parser https://github.com/aspnet/AspNetCore/blob/dba39d60c89b743d45411ba6bc6b02b5a8019032/src/Http/WebUtilities/src/FormPipeReader.cs#L219
The motivating example was from a test code and it was not very convincing.
I don't really understand why it wasn't, pretend that code was in HttpClient, that's what we're currently building. It happens to be a client used for performance testing but I'm not sure why that makes a difference.
We should focus on designing for the scenario: parsing data coming from a ReadOnlySequence. We have SequenceReader for that today. I find it confusing that we are not improving SequenceReader and instead are trying to add an alternative way to parse ReadOnlySequence that is not SequenceReader. How are the users of the platform supposed to pick the right way to parse the ReadOnlySequence?
I thought I clarified that but maybe I was unclear. First the specific example:
Can you tell me how that parsing would look if we added TryParseInt to SequenceReader?
// Find the space
if (sequenceReader.TryReadTo(out ReadOnlySequence<byte> statusCodeText, (byte)' '))
{
    // Parse the int
    var subReader = new SequenceReader<byte>(statusCodeText);
    if (subReader.TryParseInt(out int statusCode))
    {
    }
}
That's syntactically noisy, expensive and unnecessary.
Compare this with the simple story for TextReader that does not have ambiguity like this.
TextReader can read a line, the entire payload, or blocks of chars. It really doesn't do anything like what we're describing here (other than SequenceReader and TextReader sharing the Reader suffix).
Alternative Consideration:
I want to summarize the patterns I've seen in the 3 years of using these APIs "in anger", writing and reviewing lots of parsers (both for play and production). There are only a handful of patterns that need supporting to cover an extremely large range of scenarios.
SequenceReader IMO solves the problem of walking the ReadOnlySequence<byte> in a sane way; once you have narrowed your way down to a single token, the next step is to materialize it into some other form. If it's unclear what I mean, see the examples linked earlier in this thread.
IMO, I do think the ideal solution is to get to a span as quickly as possible, then parse (if the data isn't large).
The options are:

- Parse byte by byte or span by span. Some parsers do this (like JSON) and it can be faster but results in re-implementing various primitive parsers.
- Copy the ReadOnlySequence to a Span.
- Copy only in the multi-segment case.
Hybrids of the above approaches are also valid and I've seen them in the wild.
Maybe it's reasonable for primitives like this to incur the "maybe copy" by using overloads like this https://github.com/dotnet/corefx/issues/42252. If we combine this with a further API to do what @scalablecory suggests (but maybe on the reader as well), it might be the best of all worlds but I want to get your feedback.
Thoughts?
I do think the ideal solution is to get to a span as quickly as possible, then parse
Agree.
Parse byte by byte or span by span. Some parsers do this (like JSON) and it can be faster but results in re-implementing various primitive parsers.
Yes, that is the way to implement high-performance production-grade parsers. I do not think any amount of helper APIs is going to change that.
Copy the ReadOnlySequence to a Span
Copy only in the multi-segment case
Yes, I think APIs along these lines are the most powerful building blocks for lower-performance handrolled parsers that just need to get the job done.
For example, I think it would be useful to have a ReadLine API for SequenceReader<byte/char> that returns a Span and allocates the buffer to cover the splits as necessary. This Span can then be sliced and diced as necessary using regular Span APIs, including the number-parsing Span APIs. This is the same pattern one uses with TextReader today.
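A minimal sketch of such a helper under today's APIs (TryReadLine is a hypothetical name; it returns a span over the current segment when the line doesn't split, and allocates an array only when it does):

using System;
using System.Buffers;

static class SequenceReaderLineExtensions
{
    public static bool TryReadLine(ref this SequenceReader<byte> reader, out ReadOnlySpan<byte> line)
    {
        if (!reader.TryReadTo(out ReadOnlySequence<byte> sequence, (byte)'\n'))
        {
            line = default;
            return false;
        }

        if (sequence.IsSingleSegment)
        {
            line = sequence.FirstSpan;
        }
        else
        {
            line = sequence.ToArray(); // allocate only when the line crosses segments
        }

        // Trim a trailing '\r' so CRLF and LF line endings both work.
        if (line.Length > 0 && line[line.Length - 1] == (byte)'\r')
        {
            line = line.Slice(0, line.Length - 1);
        }

        return true;
    }
}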
I agree ref interfaces or ref constraints would be useful.
In terms of language design, ref interfaces / constraints are a fairly straightforward extension of our existing work. The ref constraint in particular was discussed at the time of ref struct and the behavior is mostly understood.
The tricky part is, as per usual, that all features start at -100, so we need to justify that the cost is worth it. :smile:
I have a doc I'm already working on for improvements to ref structs that I hope to start sharing around in late November (just booked solid until then). I will add ref constraints and interfaces to the list.
@davidfowl
Instead of
bool TryParseInt(IByteEnumerator enumerator, out int value)
I think we'd effectively need
bool TryParseInt<T>(T enumerator, out int value)
    where T : ref struct, IByteEnumerator

Note: I changed ByteEnumerator to IByteEnumerator to make it more clear this is meant to be an interface (at least in my sample).
The method can't be non-generic because that would imply that IByteEnumerator is a boxed value. Even if we supported ref interface instead of allowing ref struct to implement normal interfaces, my expectation is that a non-generic instance would always require some form of boxing and hence be illegal.
Is that aligned with everyone else's expectations here?
@jaredpar yes.
@jkotas
Yes, that is the way to implement high-performance production-grade parsers. I do not think any amount of helper APIs is going to change that.
I would be wary about how we use the term "production-grade"; I would swap that with "high-performance". But I agree. Arguably, byte-by-byte parsers aren't any more difficult with ReadOnlySequence/SequenceReader. The forward-only nature does make it hard to jump around though.
See https://github.com/dotnet/corefx/blob/279292804fa4f6cb856577e59246ac934a2ef9e5/src/System.Net.Http/src/System/Net/Http/SocketsHttpHandler/HttpConnection.cs#L1262, which uses Array.IndexOf to scan for tokens.
Yes, I think APIs along these lines are the most powerful building blocks for lower-performance handrolled parsers that just need to get the job done.
I think the reader already exposes enough to do this.
For example, I think it would be useful to have ReadLine API for SequenceReader<byte/char> that returns Span and allocates the buffer to cover the splits as necessary.
What happens if the line is really long? Just allocate a long byte[], or return another ReadOnlySequence<byte/char>? That's the crux of the issue here. When you've found a single token, what form do you get it in and how do you materialize it? Your answers still haven't covered that scenario.
This Span can be then sliced and diced as necessary using regular Span APIs, including number parsing Span APIs. This is the same pattern one uses with TextReader today.
The difference being that SequenceReader doesn't have to allocate a buffer; that might be the difference. It's similar to the question above, which remains unanswered. If tokens are small enough, Span is fine; otherwise you can get another ReadOnlySequence and further slice that up into spans. The problem with the latter approach is that it requires getting another SequenceReader.
As an example:
Key1=Value1\r\n
Key2=Value2\r\n
\r\n
Take this imaginary format, you can write the following code:
void ParsePair(ref SequenceReader<byte> reader, out string key, out string value)
{
    // This may allocate
    reader.TryReadTo(out ReadOnlySpan<byte> keySpan, (byte)'=');
    // This may allocate
    reader.TryReadTo(out ReadOnlySpan<byte> valueSpan, NewLine);

    key = Encoding.UTF8.GetString(keySpan);
    value = Encoding.UTF8.GetString(valueSpan);
}
OR you can get the entire line:
void ParsePair(ref SequenceReader<byte> reader, out string key, out string value)
{
    // After we add https://github.com/dotnet/corefx/issues/42252
    reader.TryReadTo(out ReadOnlySpan<byte> line, NewLine); // This might allocate

    // NewLine is consumed and excluded from line, so there's no trailing CRLF to trim
    int eq = line.IndexOf((byte)'=');
    ReadOnlySpan<byte> keySpan = line[..eq];
    ReadOnlySpan<byte> valueSpan = line[(eq + 1)..];

    key = Encoding.UTF8.GetString(keySpan);
    value = Encoding.UTF8.GetString(valueSpan);
}
I think either approach is fine but one may be more prone to allocations depending on the maximum size of the token and the segment size of the ReadOnlySequence.
Here's a summary of the 3 approaches I think we could take and the pros and cons of each:
Option 1: ReadOnlySpan<byte/char> is enough.

void Parse(ref SequenceReader<byte> reader)
{
    // This might allocate a large byte[]
    if (!reader.TryReadTo(out ReadOnlySpan<byte> line, NewLine))
    {
        return;
    }

    if (Utf8Parser.TryParse(line, out int value, out int bytesConsumed))
    {
        line = line.Slice(bytesConsumed);
    }
}
Option 2: Add parsing methods to SequenceReader, the ReadOnlySpan<byte/char> equivalent.

void Parse(ref SequenceReader<byte> reader)
{
    // This doesn't allocate
    if (!reader.TryReadTo(out ReadOnlySequence<byte> line, NewLine))
    {
        return;
    }

    var lineReader = new SequenceReader<byte>(line);
    if (lineReader.TryParse(out int value))
    {
        // 0 allocations
    }
}
Option 3: Add overloads that take ReadOnlySequence<byte/char> to various parts of the BCL, the ReadOnlySpan<byte/char> equivalent.

void Parse(ref SequenceReader<byte> reader)
{
    // This doesn't allocate
    if (!reader.TryReadTo(out ReadOnlySequence<byte> line, NewLine))
    {
        return;
    }

    if (Utf8Parser.TryParse(line, out int value, out SequencePosition bytesConsumed))
    {
        line = line.Slice(bytesConsumed);
    }
}
- Requires adding lots of methods to SequenceReader to parse textual data (option 2)
- Requires adding overloads that take ReadOnlySequence<byte/char> to various parts of the BCL (option 3)
This is the same "lots of methods" in both cases, right?
@jkotas but different, because one is all on a single type vs. being on various types in the BCL.
You need a concrete T for these parsers, which makes it impossible to add them as instance methods to SequenceReader<T> itself. So they would be extension methods, spread over as many types as you would like.
You need a concrete T for these parsers that makes it impossible to add them as instance methods to SequenceReader itself. So they would be extension methods, spread over as many types as you would like.
Logically they hang off SequenceReader<T>, which is different than changing lots of other places.
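For illustration, the fast path of one such extension might look like this, in the spirit of the corefxlab BufferReader extensions linked earlier (TryParse is a hypothetical name; the cross-segment slow path is elided):

using System;
using System.Buffers;
using System.Buffers.Text;

static class SequenceReaderParseExtensions
{
    public static bool TryParse(ref this SequenceReader<byte> reader, out int value)
    {
        // Fast path: the value is entirely within the current segment.
        ReadOnlySpan<byte> unread = reader.UnreadSpan;
        if (Utf8Parser.TryParse(unread, out value, out int consumed) && consumed < unread.Length)
        {
            // consumed < Length means we hit a non-digit terminator, so the
            // number wasn't cut off at a segment boundary.
            reader.Advance(consumed);
            return true;
        }

        // Slow path (value spans segments) elided in this sketch.
        value = default;
        return false;
    }
}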
I spent a little time writing something to parse a basic HTTP/1.1 request start line, no error handling, best-case scenario, and got these results:
BenchmarkDotNet=v0.12.0, OS=Windows 10.0.18362
Intel Core i7-8650U CPU 1.90GHz (Kaby Lake R), 1 CPU, 8 logical and 4 physical cores
.NET Core SDK=5.0.100-alpha1-015523
[Host] : .NET Core 3.1.0 (CoreCLR 4.700.19.52202, CoreFX 4.700.19.52317), X64 RyuJIT
DefaultJob : .NET Core 3.1.0 (CoreCLR 4.700.19.52202, CoreFX 4.700.19.52317), X64 RyuJIT
| Method | Mean | Error | StdDev | Ratio | RatioSD | Gen 0 | Gen 1 | Gen 2 | Allocated |
|------------------ |---------:|---------:|---------:|------:|--------:|-------:|------:|------:|----------:|
| SequenceParse | 161.1 ns | 3.22 ns | 3.16 ns | 1.25 | 0.04 | 0.0477 | - | - | 200 B |
| StreamReaderParse | 498.2 ns | 26.79 ns | 32.90 ns | 3.89 | 0.25 | 0.8678 | - | - | 3632 B |
| SpanParse | 128.6 ns | 2.65 ns | 2.95 ns | 1.00 | 0.00 | 0.0477 | - | - | 200 B |
| SequenceFastParse | 148.1 ns | 3.01 ns | 4.22 ns | 1.16 | 0.04 | 0.0477 | - | - | 200 B |
| ByteByByteParse | 205.9 ns | 3.57 ns | 3.34 ns | 1.60 | 0.05 | 0.0401 | - | - | 168 B |
https://gist.github.com/davidfowl/323a5a037700ad49692b97c0afff69f6
The StreamReader is obviously a bit of a troll, but I wanted to show that even though the SequenceReader is slower than the direct Span API, it doesn't allocate, and it's possible to write pretty convenient APIs that are reasonably fast.
The SequenceFastParse also shows that with a little more effort, it can be brought a bit closer to Span performance at the cost of a bit more complex code (slow path, fast path).
@davidfowl The benchmark sample you linked doesn't use the TryParse methods you were proposing. Was this intentional?
Yes (but I can add it if need be); the conversation is now about what "production ready" parsers look like. I stand by what I said earlier in this thread and I just wanted to have some concrete data to back it up.
I want to close on a strategy based on these options here https://github.com/dotnet/corefx/issues/42255#issuecomment-549705509 and then push forward with additions in that direction.
@davidfowl I think your byte-by-byte parser isn't realistic. It should have an implementation similar to SequenceReader.
@scalablecory Send me the codez and I'll update the benchmark.
@davidfowl Here's more what I had in mind -- faster than your version, but the JIT unfortunately defeats much of the benefit... this is basically "free" in C++ but has considerable overhead here. Imagine the two helper methods are a single generic one (they have the same code).
https://gist.github.com/scalablecory/4b1320044b911c5795531731d3d86f6a
@GrabYourPitchforks to your point, I will add an example of a basic Utf8Parser.TryParse with the various techniques to help answer @jkotas's questions around performance. I'm coming back to just adding the originally proposed overloads based on the existing data, but let's try some more things.
@jkotas's questions around performance
The question for me is whether the cost of these new APIs (confusion about what are the right parsing APIs to use, code-bloat associated with having yet another set of parsing APIs for everything, ...) is worth it to avoid some GC heap allocations in less common cases.
These text parsers are very convenient when parsing text-based protocols (like HTTP/1.1 and RESP). It would be great to have overloads of the existing APIs that support ReadOnlySequence<byte>. I end up writing code like this:

It would be great if this was built into the existing parser.
cc @ahsonkhan @GrabYourPitchforks