Closed caesar-chen closed 4 years ago
cc @Tratcher
+@shirhatti
@JunTaoLuo isn't GRPC primarily interested in HTTP/2 trailers?
Comparing Case 1 and Case 2 calls out a hidden state field devs may need access to: weather or not a given instance of HttpContent has been read/buffered. If I examine TrailingHeaders and find it empty it's ambiguous to me if that's because the response is incomplete or if there were no trailers. HttpContent needs an API like bool IsComplete { get; }
.
HttpResponseMessage.TrailingHeaders
vs HttpContent.TrailingHeaders
? Trailers are more often descriptions of the content which is why they can't be generated in advance. Consider if I'm writing APIs that consume the HttpContent and then need to verify a hash from the trailers like Content-MD5? There's no back-reference from HttpContent to HttpResponseMessage.
I anticipate a similar scenario for generating request trailers based on the content. We should have an idea for how request trailers would work even if we don't implement them at the same time. I'd expect the contract to be that request trailers must be provided before HttpContent.SerializeToStreamAsync completes, and will be sent immediately afterwards.
@Tratcher yes we need HTTP/2 trailers in the response for gRPC, as mentioned by @caesar1995.
I can work on HTTP/2 implementation first. It's similar - the HEADERS frame starting the trailers header block is sent after DATA frames have been sent.
HttpResponseMessage.TrailingHeaders vs HttpContent.TrailingHeaders?
According to RFC 7230:
When a chunked message containing a non-empty trailer is received, the recipient MAY process the fields (aside from those forbidden above) as if they were appended to the message's header section.
I think HttpResponseMessage.TrailingHeaders
is more appropriate. But your example is interesting, I'd like to hear more people's feedback on this.
I anticipate a similar scenario for generating request trailers based on the content.
Let me double check the gRPC spec to make sure if we need this on client side.
we don't need request trailers for the work that's currently planned for gRPC.
That's good to know for scoping, but if possible, I'd like to see API design for both request and response. They should be consistent. Then we can decide to implement just half of it (the response trailing headers) in 3.0.
Per offline discussion today:
HttpResponseMessage.TrailingHeaders
. In order to get HttpContent
, developer will always get HttpResponseMessage
first.The API review is scheduled for next Tuesday. In the meantime, I will follow up with WinHttp team about stream buffering in HTTP2 - to explore the possibility that always have TrailingHeaders
information ready before returning the HttpResponseMessage
.
I took a look at the relevant code, and I think I can actually answer your question about stream buffering. We currently maintain a per-stream response buffer in Http2Stream: https://github.com/dotnet/corefx/blob/ee57e77c81a18ced855685f519b521e6a34d9821/src/System.Net.Http/src/System/Net/Http/SocketsHttpHandler/Http2Stream.cs#L50
This keeps individual streams from being able to block the whole connection if the consumer isn't actively reading data. We won't block the whole connection until we hit the window size limit. If we get to that point, we expect a well behaved server to stop sending data. If the server doesn't respect that limit, we'll respond with a fatal error and close the connection: https://github.com/dotnet/corefx/blob/ee57e77c81a18ced855685f519b521e6a34d9821/src/System.Net.Http/src/System/Net/Http/SocketsHttpHandler/Http2Stream.cs#L174-L182
to explore the possibility that always have TrailingHeaders information ready before returning the HttpResponseMessage.
@caesar1995 That's a non-starter for streaming scenarios. Http2 requires some per-stream buffering to multiplex, but not full buffering.
What about bool IsComplete { get; }
?
Agreed with @Tratcher on that one. Our per-stream buffering means that it is possible for us to have read the trailing headers before the client has read the content on the stream, but in the case of a large response that fills the window the user will always have to read some of the content before we can receive the trailing headers.
What about bool IsComplete { get; }?
I feel like the bool IsComplete { get; }
API would encourage developers to just sit on a thread checking the value until it is true, and I don't think that's a good pattern.
@rmkerr how do you differentiate between "the trailers haven't arrived yet" and "there are no trailers"? Or do you assume the component that consumes the content is always going to be the one that reads the trailers?
You are right to worry about the trailers showing up early. At a minimum there's a concurrency issue if you try to populate them on the background connection thread and check for them on the app thread. That lazy initalizer could get you in trouble.
That's a good question, and I'm not actually sure what the answer is.
@caesar1995, do you know how we would handle a server sending the Trailers
header, and then not actually including any trailers? Is that even a realistic concern?
The Trailers response header is only a hint, there's nothing stopping the server from sending any trailers it wants.
@caesar1995, do you know how we would handle a server sending the Trailers header, and then not actually including any trailers? Is that even a realistic concern?
Server sends Trailer
header to inform the recipient to prepare for the metadata. If it sends the Trailer header and doesn't including any trailers, we should treat the response as malformed and TrailingHeaders
field should just return an empty HttpResponseHeaders
collection.
The Trailers response header is only a hint, there's nothing stopping the server from sending any trailers it wants.
I think in this case, we should ignore any unexpected trailing headers - just like ignoring all disallowed trailers. https://tools.ietf.org/html/rfc7230#section-4.1.2
What about bool IsComplete { get; }? how do you differentiate between "the trailers haven't arrived yet" and "there are no trailers"? Or do you assume the component that consumes the content is always going to be the one that reads the trailers?
The component that consumes/buffers the response content is always the one reads(knows) about the trailers information. I don't think we need to differentiate these two cases - by default we return the HttpResponseMessage
after parsing & storing the trailing headers. If developers want to do that themselves (by setting completion option to ResponseHeadersRead
), and they decided to dispose the content before fully read it, the TrailingHeaders
field should just be empty. - We definitely need to document this behavior.
The Trailer header is only a SHOULD.
The client should be loose on accepting trailers other than those explicitly prohibited, regardless of the contents of the Trailer header. Cross checking with the Trailers header would be more work on your part, not less.
If the server sends a Trailer header and does not send the associated trailer, oh well, the app can deal with the missing value.
The Trailer header is only a SHOULD. The client should be loose on accepting trailers other than those explicitly prohibited, regardless of the contents of the Trailer header.
This makes sense. Also the algorithm proposed by the RFC inline with your approach as well. (https://tools.ietf.org/html/rfc7230#section-4.1.3) I will use this approach in implementation. Thanks!
If the server sends a Trailer header and does not send the associated trailer, oh well, the app can deal with the missing value.
Agree.
A few thoughts:
It seems weird that accessing TrailingHeaders before they are available (in the ResponseHeadersRead case) will just return no headers. This makes it easier for a user to be confused and assume there are no trailing headers, when really we just haven't read the whole response yet. Seems like we should throw in this case instead.
As mentioned above, we need to be super clear about when trailers are actually available. I think that trailers should not be available until the user has read the entire response body -- even if we've actually buffered the response body internally and read the trailers. Otherwise, this could lead to inconsistent behavior, where sometimes you get lucky and the trailers become available due to how buffering happened, and sometimes you don't.
Note that this only matters for ResponseHeadersRead; with ResponseContentRead (the default), trailers will always be immediately available.
Similarly, if we add IsComplete or something like that, IsComplete should only become true when the user has completely read the response body, not simply when we have buffered it internally.
That said, I am not convinced that IsComplete is really needed. I think the common pattern here will be:
(1) Issue request and await response (2) Read (non trailing) response headers as needed (3) Process entire response body, to EOF (4) Read trailing response headers as needed
So it's not clear to me what value IsComplete would add. If you rely on IsComplete, you're probably doing something wrong.
We should have an idea for how request trailers would work even if we don't implement them at the same time.
I agree, we should define the API for request headers and ensure that there is some consistency between request and response here. It's fine to not implement the request support for 3.0.
@geoffkizer if TrailerHeaders throws then IsCompleted becomes essential. You can't assume all of the response processing happens in the same place or that all of the context flows. Consider this method that does not have all of the context for the response consumption state.
public void LogResponse(HttpResponseMessage response)
{
// Log status code
// foreach header in response.Headers, log
// foreach header in response.Content.Headers, log
try
{
// foreach header in response.TrailerHeaders, log
}
catch (InvalidOperationException) { } // The trailers aren't available yet
}
vs:
public void LogResponse(HttpResponseMessage response)
{
// Log status code
// foreach header in response.Headers, log
// foreach header in response.Content.Headers, log
if (response.Content.IsCompleted)
{
// foreach header in response.TrailerHeaders, log
}
}
If TrailerHeaders didn't throw then the worst case would be this method failing to log them. If it does throw then this method has to deal with that one way or the other. There should always be an alternative to wrapping an API in a try/catch.
@Tratcher Yeah, that makes sense. I agree that we need a way to determine whether the trailers are available yet, or not, without try/catch.
That said, "IsComplete" isn't a great name for this. Perhaps "TrailingHeadersAvailable"? Or maybe change the API here from a property to bool TryGetTrailingHeaders(out HttpHeaders)?
"TrailersAvailable"?
TryGet is overkill for the common case where you already know they're available and use the Property. The state also only ever changes from negative to positive so atomicity isn't a concern.
It seems weird that accessing TrailingHeaders before they are available (in the ResponseHeadersRead case) will just return no headers. we need a way to determine whether the trailers are available yet, or not, without try/catch.
TrailersAvailable
sounds good to me as well. Added to the description for API review.
I think that trailers should not be available until the user has read the entire response body -- even if we've actually buffered the response body internally and read the trailers.
Yes, I plan to implement this way. If developers set completion option to ResponseHeadersRead
, and they decided to dispose the content before fully read it, the TrailingHeaders
field should just be empty.
@caesar1995 You quoted my comment re throwing when trailers aren't available yet, but you didn't really respond to it -- do you think this should throw or not?
I think TrailingHeaders
should throw because:
TrailersAvailable
to guard/avoid exception try-catch: we can document this pattern in case developers choose to read response body themselves.The throw
will not affect the most common (which is the default ResponseContentRead
, it will never throw), but it's very helpful to provide additional information in the scenarios I described above. @davidsh Do you have any push back on the throw?
@davidsh Do you have any push back on the throw?
I don't think it should throw. It seems to be a violation of standard .NET design guidelines for good API design.
See: https://docs.microsoft.com/en-us/dotnet/standard/design-guidelines/property
AVOID throwing exceptions from property getters. Property getters should be simple operations and should not have any preconditions. If a getter can throw an exception, it should probably be redesigned to be a method.
Are there other examples of a property getter that return a collection that throw an exception?
Are there other examples of a property getter that return a collection that throw an exception?
I'm not sure. I will do some research sometime later.
For now, I will cancel the review for this API tomorrow, since we haven't reached agreement yet. We can have more time discuss the design in the Wednesday meeting.
The summary of today's discussion:
TrailingHeaders
.HttpResponseHeaders
type should be returned to reduce allocation.TrailingHeaders
will be available only after the underneath response body stream has reached EOF.TrailingHeaders
can return null - this means trailing headers are not ready yet (stream hasn't fully read).I will bring this for the API review next Tuesday. cc: @dotnet/ncl @Tratcher @stephentoub
In the case where we know in advance that there is no trailing headers
Not exactly. At some point we need to set the field that backs the TrailingHeaders property, and the idea is we set that to either a collection containing all the read headers or to a singleton read-only collection if there were none. It's not really a case of knowing in advance. In pseudo-code just mocking up the general flow, you'd have:
TrailingHeadersCollection c = null;
foreach (string line in ...)
{
if (c == null) c = new TrailingHeadersCollection();
c.Add(Parse(line));
}
_trailingHeaders = c ?? TrailingHeadersCollection.Empty;
What will code look like if the caller wants to wait until trailers are available? Read the stream until it is complete?
Read the stream until it is complete?
If they've used ResponseHeadersRead rather than ResponseContentRead, yes, e.g. await stream.CopyToAsync(Stream.Null).
we should treat the response as malformed and TrailingHeaders field should just return an empty HttpResponseHeaders collection.
Is it really wise to just suppress protocol errors? The RFC allows for suppressing and for treating it as an error. Making it an error helps with debugging.
Normally, client code should know by the structure of the code whether the response has been read already or not. I don't see why someone would want to test whether we have arrived at the trailer section in the stream or not.
Accessing trailers at a point where it is logically impossible to have them (before the entire response has been read) is an API usage error and should throw. The property should not return null
(or even worse an empty collection), IMO.
If you could test for trailers present what would you do if the test returned false? You have no way to mitigate that condition. You must structure the code so that trailers are available at the point of access by construction.
Normally, code would look like this:
Simply by following this sequence it is clear when it is permissible to access trailers.
Is it really wise to just suppress protocol errors?
Forgot to update this thread, we will throw for protocol errors. For the most recent discussion on the issue, please see the in progress PR here: dotnet/corefx#35337.
Accessing trailers at a point where it is logically impossible to have them (before the entire response has been read) is an API usage error and should throw.
We don't throw for property getters, it's a violation of standard .NET design guidelines for good API design: https://docs.microsoft.com/en-us/dotnet/standard/design-guidelines/property
If you could test for trailers present what would you do if the test returned false? You have no way to mitigate that condition.
Are you asking the no trailers present case? If client doesn't expect any trailers, then it will not need to access the TrailingHeaders
. If it does, the TrailingHeaders
is lazy initialized when you access it, an empty collection will be returned.
Simply by following this sequence it is clear when it is permissible to access trailers.
Of course. But developer can set HttpCompletionOption.ResponseHeadersRead
, and take over the control for step 2 & 3. We explore many approaches here and dotnet/corefx#35337.
Here is the new summary:
TrailingHeaders
is needed.
a. This property does NOT have a setter.TrailingHeaders
is lazy initialized, similar approach with Headers
property.
a. Need to check disposed
to avoid connection is drained/returned to the pool before fully read the response.
b. When TrailingHeaders
is not ready, we will return an empty collection – not Null.When TrailingHeaders is not ready, we will return an empty collection – not Null.
Properties should rather throw than provide incorrect values. Returning an empty collection is an incorrect value that will lead to incorrect further computation. It's a nasty pitfall for developers.
Returning a default value does not solve any problem. Rather it obscures a usage error by turning of the warning (the exception).
X AVOID throwing exceptions from property getters.
This is a sensible guideline but it's just a guideline. Many framework properties throw if they cannot do the job that is being asked (e.g. Stream.Length
. Should we rather make Stream.Length
return 0 or -1? Of course not. We must throw. Or, return (int?)null
.). Sometimes throwing is the best overall design.
This way we do not have to expose another property TrailersAvailable
as proposed earlier in https://github.com/dotnet/corefx/issues/34912#issuecomment-460509668. Using try-catch to detect if trailers are available would have perf impact and it would violate general design of exceptions to be exceptional situations.
Given that TrailingHeaders feature is corner case feature almost nobody uses anyway (except gRPC) and given that we will throw/return empty collection, only if user uses specific approach via HttpCompletionOption.ResponseHeadersRead
(corner case of a corner case), we believe this design is ok.
And it has a nice side-effect of following API guidelines by not throwing from property getters :)
I understand the problems with the alternative designs. But returning a wrong result is the worst of all possible choices. In my opinion, this is an egregious design error that we should not add to the Framework (and keep it there forever for compat).
I have no skin in the game here and I certainly respect your right to make this decision. But I will make one more attempt to convince the team:
Stream.Length
and many others).null
.If this does not convince you I give up :laughing:.
@GSPP Thank you for your feedback on the API design. Many of the issues you cite above were discussed in the API review meeting. I think those meetings are recorded and publicly available so you can view our discussions.
Throwing in a property is entirely appropriate and acceptable if the property can not do it's job (see Stream.Length and many others). I recommend returning null. It allows for detection and any usage error will be noticed quickly.
We did not want to throw an exception nor return null
. The reason is that the HttpClient API family (including HttpRequestMessage and HttpResponseMessage) is agnostic with respect to the handler HTTP stack (i.e. SocketsHttpHandler for example). We allow server frameworks to consume, use and populate HttpResponseMessage objects for example. Throwing an exception or returning null
from doing a get of the HttpResponseMessage.TrailingHeaders
property would prevent server-side software from constructing the TrailingHeaders collection. It would also prevent mocking frameworks (i.e. using dependency injection) as well.
@davidsh OK, that reason makes sense. I guess this overrides the concerns that I had. Thanks for responding.
Will HttpResponse.TrailingHeaders
be in .NET Standard 2.1?
cc @terrajobst
It would force all independent implementers to implement it ... not triavial thing to ask for IMO. AFAIK the right process is to file an issue in dotnet/standard repo ...
@karelz They'd only need to provide the API which is trivial. It would still be optional for the the various handlers to implement the functionality. It wasn't implemented here for Curl or WinHttp.
Feels weird to expose something that is not supported though. We got danged for Standard APIs not working everywhere by community, it is confusing for them. I let @terrajobst guide us here - it is more about philosophy of standard, not about these specific APIs. Please create the request on standard repo. We should have the discussion there.
gRPC (Google RPC) protocol uses HTTP trailers to send/receive metadata. On the client side, SocketsHttpHandler needs to implement support for trailing headers to receive metadata.
API Proposal
To support trailing headers come with the response message, we need a new Property in
HttpResponseMessage
class.Usage
Trailers are part of the response body (chunked message), the information stored in
TrailingHeaders
field will be ready for developer after the underneath response stream is read to the EOF.By default, the
HttpCompletionOption
for response isResponseContentRead
, which indicates HttpClient operations (Get/Send) will be considered completed after reading the entire response message including the content. Therefore, we will always have the trailing headers information ready in this case.However, if developer wants to get response as soon as it's available (
HttpCompletionOption.ResponseHeadersRead
), he needs to issue an explicit read on the response stream to the EOF to get the trailing headers. If theTrailingHeaders
is not ready, we will return an empty collection.Sample Code
HTTP2
The TE header on the request can only contains
trailers
value. (RFC 7540)No additional API is needed for HTTP2.
cc: @dotnet/ncl @geoffkizer @stephentoub