Open davidfowl opened 4 years ago
some of them have already been prototyped in the experimental project.
I think it would be useful to have a description of what these are lacking before they can be moved into the main project.
How familiar would you have to be with the HTTP spec to work on these?
Decide on the representation for an HTTP request and body on the server side Decide on the representation for an HTTP request and body on the client side.
At the moment you're using System.Net.Http. Is there anything wrong with this?
I think it would be useful to have a description of what these are lacking before they can be moved into the main project.
The body parsers that are prototyped should be easy to make progress on without too much fuss. The easy ones are IMessageReader<ReadOnlySequence<byte>>
and IMessageWriter<ReadOnlySequence<byte>>
(maybe another form of byte as well).
At the moment you're using System.Net.Http. Is there anything wrong with this?
Yes I'm not happy with this, I did this because I was being lazy. I'd much rather have a lower level implementation that either didn't enforce a specific shape (callback based) or a representation offers access to a header at a time (which seems a little inefficient).
I think this will require some experimentation to get it right. I don't have the right answers as yet.
How familiar would you have to be with the HTTP spec to work on these?
These parsers aren't that difficult to write, there are a few implementations in ASP.NET Core (for readers).
It's also worth discussing header parsing separately as it's used in the request parser, the response parser and the multipart parser. It's possible we just need a header representation and not one for the entire request/response.
Ok. I'm going to start off real simple here.
Firstly I agree System.Net is not a great representation - it's very heavy, and very opinionated. Also it's quite likely we'll want to keep everything as bytes until as late as possible. For example, it's likely you'll want to use UTF8 rather than strings in some applications.
As such, I think I'll start off with the simplest possible component that will be unarguable: An HttpHeaderReader returning a
ref struct HttpHeader
{
public HttpHeader(ReadOnlySpan<byte> fieldName, ReadOnlySpan<byte> fieldValue)
{
FieldName = fieldName;
FieldValue = fieldValue;
}
public readonly ReadOnlySpan<byte> FieldName { get; }
public readonly ReadOnlySpan<byte> FieldValue { get; }
}
In the long term I'm thinking of something like this:
public interface IHttpRequestMessageBuilder
{
public void AddHeader(HttpHeader header);
public void AddVersion(Version version);
...
}
public HttpRequestMessageReader<TBuilder> : IMessageReader<TBuilder> where TBuilder : IHttpRequestMessageBuilder
{
public HttpRequestMessageReader(Func<TBuilder> createNewBuilder) => _createNewBuilder = createNewBuilder;
}
I think that would be quite easy to adapt to most representations of HttpRequests.
Long term we'll probably also want to provide helpers/parsers to aid with Well Known Http headers, but I think that's not necessary for the POC.
Agreed?
Also, do we want to support streaming http? Because IMessageReader isn't a suitable abstraction for that at the top level - i.e. it requires the entire bytes for the message is loaded into memory at once, which is fine for parsing request headers, but not ideal for parsing the entire http request.
Perhaps we ought to design an API suitable for streaming first, and then implement the non-streaming one on top of that?
As such, I think I'll start off with the simplest possible component that will be unarguable: An HttpHeaderReader returning a
That's basically the callback approach though I don't know if it makes sense returning the IHttpRequestMessageBuilder
. In this case, the IMessageReader isn't really returning anything (other than the IHttpRequestMessageBuilder
that was passed in). Not sure if it needs to be a Func<IHttpRequestMessageBuilder>
either. You've designed what we did in Kestrel 😄. Here are the interfaces in ASP.NET Core:
I think it's proven and a solid approach.
The other option would be the have the parser return a single struct (not ref struct) that returns the startline and each header value one at a time. It avoids having to allocate a dictionary, or invert the model to be a callback. Something like:
public readonly struct HttpRequestStartLine
{
public ReadOnlySpan<byte> Method { get; }
public ReadOnlySpan<byte> Path { get; }
public ReadOnlySpan<byte> Version { get; }
}
public readonly struct HttpHeader
{
public ReadOnlySpan<byte> FieldName { get; }
public ReadOnlySpan<byte> FieldValue { get; }
}
public class HttpRequestStartLineParser : IMessageReader<HttpRequestStartline> { }
public class HttpHeaderReader: IMessageReader<HttpHeader> { }
For example, it's likely you'll want to use UTF8 rather than strings in some applications.
Yes, this is interesting. @GrabYourPitchForks, this is a real use case for UTF8 string.
Long term we'll probably also want to provide helpers/parsers to aid with Well Known Http headers, but I think that's not necessary for the POC.
Yep! Agreed. Should add that in another issue actually. I forgot about those.
Agreed?
I'd like to understand the merits of the other approach but I'm pretty sold on the callback approach as well.
Also, do we want to support streaming http? Because IMessageReader isn't a suitable abstraction for that at the top level - i.e. it requires the entire bytes for the message is loaded into memory at once, which is fine for parsing request headers, but not ideal for parsing the entire http request.
Not sure what you mean here. If you're talking about the body then it works for that too. Your last pull request was basically the model for how that would work.
For example, here's the HttpClient implementation:
Startline + headers https://github.com/davidfowl/BedrockFramework/blob/fe439694c193db13708235ce654f3a51f9b7962b/src/Bedrock.Framework.Experimental/Protocols/HttpClientProtocol.cs#L43-L55
Body:
Your last PR actually was the last step in making this work well.
You've designed what we did in Kestrel 😄. Here are the interfaces in ASP.NET Core:
What is the reason we're not using this, and writing our own instead? I'd just like to get an idea for the aims before I start writing something that isn't fit for purpose...
public readonly struct HttpHeader { public ReadOnlySpan<byte> FieldName { get; } public ReadOnlySpan<byte> FieldValue { get; } }
That won't compile
That won't compile
Yea I know it wouldn't compile. I wasn't trying to make it compile (the data would be stored as a ReadOnlyMemory
public readonly struct HttpHeader
{
private readonly ReadOnlyMemory<byte> _name;
private readonly ReadOnlyMemory<byte> _value;
public HttpHeader(ReadOnlyMemory<byte> name, ReadOnlyMemory<byte> value)
{
_name = name;
_value = value;
}
public ReadOnlySpan<byte> Name => _name.Span;
public ReadOnlySpan<byte> Value => _value.Span;
}
What is the reason we're not using this, and writing our own instead? I'd just like to get an idea for the aims before I start writing something that isn't fit for purpose...
It's internal/pubternal unsupported.
It's internal/pubternal unsupported.
Any reason not to copy it and then develop as necessary though?
@YairHalberstadt Nope, it's a fine starting point. It needs to be shaped into the IMessageParser format is all.
IMessageParser
IMessageReader?
Thanks - that's enough information to get started for now!
The kestrel HttpParser doesn't seem to support line folding in headers.
https://github.com/dotnet/aspnetcore/issues/13923
I presume we don't need to either?
Also do we want to guarantee we'll throw on invalid data, or is it sufficient to always accept and correctly parse valid data?
The first will cause significant performance hits compared to the second...
I presume we don't need to either?
Start without it.
Also do we want to guarantee we'll throw on invalid data, or is it sufficient to always accept and correctly parse valid data?
We could return a failure as part of the parse result. Actually that might be even better (maybe a ParseResult<T>
).
PS: I added a gitter room https://gitter.im/BedrockFramework/BedrockFramework
@davidfowl I think the next step with Http1 is to create an Http1HeadersParser which parses all the headers in a message, and can be shared between the Request and the Response parser. If we want to maximize performance, it should use callbacks to avoid allocating a collection for the headers. What I'm not sure about is the best way to do error handling via such a model.
One option is to return a union indicating success, failure, or need more data, separately to using the callback. An alternative is to have an OnFailure method on the callback. Alternatively we could simply use exceptions, seeing as errors should be a very uncommon case in real life usage, and there isn't really a good way to recover from them
We should have a well tested blazing fast implementation of the various HTTP/1.x parsers:
Server
Client
Shared
Each of these can be implemented and tested separately (for the most part) and some of them have already been prototyped in the experimental project.