Closed dret closed 7 years ago
another aspect to consider here is that if a link parameter is unknown, it could be repeatable. what would that mean for a model where unknown link parameters are not dropped? it might be something like requiring serialization into a delimiter-separated list within the string value, using a delimiter such as whitespace, comma, or semicolon.
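A minimal sketch of the delimiter-separated idea, assuming link parameters have already been parsed into (name, value) pairs. The function name `join_repeated` and the choice of a space as the default delimiter are hypothetical and not taken from the draft.

```python
from collections import defaultdict

def join_repeated(params, delimiter=" "):
    """Collapse repeated parameter names into one delimiter-separated string."""
    grouped = defaultdict(list)
    for name, value in params:
        grouped[name].append(value)
    # Every value stays a plain string, repeated or not.
    return {name: delimiter.join(values) for name, values in grouped.items()}

# Hypothetical parameters of one link; "x" is unknown and occurs twice.
params = [("rel", "item"), ("x", "123"), ("x", "abc")]
joined = join_repeated(params)
print(joined)
```

The weakness of this design is visible in the sketch: a single value that itself contains the delimiter cannot be told apart from two separate values once it has been joined.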
On 2017-08-15 14:18, Herbert Van de Sompel wrote:
It could also mean:
- pass around what you have received if you are a syndicator
- ignore what you don't understand if you're actually consuming
but that's what's happening anyway. if i am consuming and need to understand a link parameter, then i am of course ignoring the ones that i do not understand.
but if my job in the pass-around role is to pass around links that i see in link headers, and i am passing them around in their JSON serialization, how do i do this?
Then you need to understand, because you consume prior to re-serializing.
On 2017-08-15 14:29, Herbert Van de Sompel wrote:
Then you need to understand, because you consume prior to re-serializing.
so you define an intermediary as a component that does not parse or serialize? that's a bit unusual. in that case the notion of an intermediary is not very helpful. regardless of terminology, what you're saying is that such a component (regardless of its name) that is tasked with serializing HTTP Link header fields would not be allowed to serialize any unknown link parameters. that would make it impossible to have generic components for monitoring, logging, and other tasks that otherwise could benefit from the JSON serialization.
Do you need to serialize what you have parsed (and understood)? Isn't it possible to parse, deserialize and then just pass the untouched source on to the next processor? Then the parser can understand whatever it needs to in order to do what it wants without affecting other processors and possibly screwing up by removing or wrongly serializing unknown parameters.
Just for the record: I don't know where the language about ignoring unknown attributes in the I-D came from; I don't remember adding it. But, honestly, I think it makes a lot of sense. Let's not forget we are talking about third party resources that pass on links that pertain to other resources. Maybe they should know what they are doing? For example, in most cases, they will need to add anchor attributes because the links don't pertain to themselves.
I think that the interesting bit about the suggestion I made re handling repeatable attributes in JSON is that it is immediately clear from a payload what is repeatable and what is not. So, if we can assume that a resource that uses repeatable attributes serializes appropriately in JSON, then downstream applications can serialize appropriately both in JSON and in the native format. Unfortunately, not the other way around (native to JSON) because the native format has no way to express whether attributes are repeatable. In this case, the consuming application needs to be in the know to transform from native to JSON.
On 2017-08-15 14:50, Asbjørn Ulsberg wrote:
Do you need to serialize what you have parsed (and understood)? Isn't it possible to parse, deserialize and then just pass the untouched source on to the next processor? Then the parser can understand whatever it needs to in order to do what it wants without affecting other processors and possibly screwing up by removing or wrongly serializing unknown parameters.
let's chat at RESTfest! my current concern is about a scenario where an intermediary monitors HTTP traffic, looks at Link headers, and is attempting to serialize those as JSON so that they can be accessed as JSON data. the question is: can unknown link parameters be represented in JSON? one option is to say "no". another option is to say "yes", but then there needs to be a definition of the "how" as well, and one that does not require knowledge of specific link parameters.
That is indeed a pertinent issue, see my above comment.
On 2017-08-15 14:52, Herbert Van de Sompel wrote:
Just for the record: I don't know where the language about ignoring unknown attributes in the I-D came from; I don't remember adding it. But, honestly, I think it makes a lot of sense. Let's not forget we are talking about third party resources that pass on links that pertain to other resources. Maybe they should know what they are doing? For example, in most cases, they will need to add anchor attributes because the links don't pertain to themselves.
we did change the focus of the media types to be scenario-agnostic (#70), so this discussion is about any usage of the media types.
I think that the interesting bit about the suggestion I made re handling repeatable attributes in JSON is that it is immediately clear from a payload what is repeatable and what is not.
that's true for JSON but not true for native. so if i am attempting to serialize data i received via native into JSON, then i won't know whether that value is repeatable or not; all i can tell from looking at a link is whether the value is repeated.
So, if we can assume that a resource that uses repeatable attributes serializes appropriately in JSON, then downstream applications can serialize appropriately both in JSON and in the native format. Unfortunately, not the other way around (native to JSON) because the native format has no way to express whether attributes are repeatable. In this case, the consuming application needs to be in the know to transform from native to JSON.
yes, and that's at the heart of this issue. in 99.9% of cases on the web, the starting point will be native (when monitoring web traffic and looking at headers), so this is an important part of the puzzle.
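To illustrate the repeatable-versus-repeated distinction, here is a small sketch, assuming parameters arrive as (name, value) pairs parsed from the native format. The helper `repeated_names` and the sample data are hypothetical.

```python
from collections import Counter

def repeated_names(params):
    """Return the parameter names that occur more than once in one link."""
    counts = Counter(name for name, _ in params)
    return {name for name, n in counts.items() if n > 1}

# Two hypothetical links using the same unknown parameter "x".
link_a = [("rel", "item"), ("x", "123")]
link_b = [("rel", "item"), ("x", "123"), ("x", "abc")]

print(repeated_names(link_a))
print(repeated_names(link_b))
```

Looking at `link_a` alone, a schema-agnostic serializer has no evidence that "x" may repeat; only `link_b` reveals it. Any JSON shape that encodes repeatability as a type difference would therefore serialize the two links inconsistently.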
just dropping in here...
JSON's limits on the uniqueness of key elements is something i deal w/ often in designing representations. what i do is create anonymous objects that can appear within an array.
for example:
{
  ...
  "params": [
    {"name":"url","value":"..."},
    {"name":"rel","value":"..."},
    {"name":"x","value":"123"},
    {"name":"x","value":"abc"}
  ]
  ...
}
this approach can be applied to all parameters (above) or just the "unknown" parameters:
{
  "url" : "...",
  "rel" : "...",
  "params": [
    {"name":"x","value":"123"},
    {"name":"x","value":"abc"}
  ]
  ...
}
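A rough sketch of how a serializer could produce the second shape, assuming parameters are available as (name, value) pairs. The function `serialize_link` and the set of "known" names are hypothetical; only the `url`, `rel`, and `params` member names come from the example above.

```python
import json

KNOWN = {"rel"}  # assumption: names this serializer has a dedicated member for

def serialize_link(url, params, known=KNOWN):
    """Put known parameters at the top level and all others in a name/value array."""
    link = {"url": url}
    extras = []
    for name, value in params:
        if name in known and name not in link:
            link[name] = value
        else:
            # Unknown (or repeated) parameters keep their order and multiplicity.
            extras.append({"name": name, "value": value})
    if extras:
        link["params"] = extras
    return link

link = serialize_link("/", [("rel", "item"), ("x", "123"), ("x", "abc")])
print(json.dumps(link, indent=2))
```

Because each array entry is its own object, the serializer never needs to know whether "x" is repeatable; it only records what it saw.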
Very interesting. Thanks for the insight, @mamund
@hvdsomp
no problem.
also, FWIW, i think it is wise to follow the HTTP proxy processing pattern from RFC 1945 here: "Unrecognized header fields should be ignored by the recipient and forwarded by proxies."
it's possible that you may be thinking more along the lines of HTML: “The HTML parser will ignore tags which it does not understand, and will ignore attributes which it does not understand…”
however, i think what you're implementing here is more likely to be used by intermediaries (proxies) rather than clients (e.g. HTML browsers).
so, i would not drop things we don't understand -- just pass them along "as-is".
On 2017-08-15 16:21, Mike Amundsen wrote:
@hvdsomp https://github.com/hvdsomp also, FWIW, i think it is wise to follow the HTTP proxy processing pattern from RFC 1945 here: /Unrecognized header fields should be ignored by the recipient and forwarded by proxies./
it's possible that you may be thinking more along the lines of HTML: /“The HTML parser will ignore tags which it does not understand, and will ignore attributes which it does not understand…”/
however, i think what you're implementing here is more likely to be used by intermediaries (proxies) rather than clients (e.g. HTML browsers).
good references! and yes, when things are not forwarded then we don't really have to think about how/if they are represented. but if they are forwarded, that's what we're discussing here. i'd definitely prefer the HTTP behavior here (we're talking about HTTP link headers after all), instead of mandating to ignore unknown fields.
Sorry to repeat myself, but links will in most scenarios not just be forwarded because the link context will change when they are being forwarded. Only for links that have explicit link context and link target would it be possible to blindly forward.
On 2017-08-15 16:43, Herbert Van de Sompel wrote:
Sorry to repeat myself, but links will in most scenarios not just be forwarded because the link context will change when they are being forwarded. Only for links that have explicit link context and link target would it be possible to blindly forward.
any kind of HTTP monitoring solution will be interested in links. these will be captured within the context of the HTTP request.
any kind of HTTP proxying solution forwarding links from native to JSON will rewrite links to add anchors, iff the request context changes. if it does and the rewriting happens, it would be odd if we were required to drop some link parameters from the original data.
just as a reminder: if we make ignoring unknown link parameters mandatory, that means we're creating a representation that critically depends on schema knowledge. that's neither a popular nor a usually successful design on the web. in practical terms:
Link: </>; rel="http://example.net/foo"; this="x"; might="y"; be="z"; important="right?"
then must be serialized like this:
{"href":"/","rel":"http://example.net/foo"}
unless the serializer knows all the link parameters and they all have defined JSON serializations.
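The loss can be made concrete with a short sketch. It assumes a deliberately simplified parser (one link per header value, no escaped quotes, every parameter has a value), not a complete Web Linking parser, and a hypothetical serializer `serialize_known_only` that knows only `rel`.

```python
import re

# Simplified: one link per value, no escaped quotes, every parameter has a value.
PARAM = re.compile(r';\s*([^=;\s]+)\s*=\s*(?:"([^"]*)"|([^;,\s]*))')

def parse_link(value):
    """Split a single link value into its target and ordered (name, value) pairs."""
    target = re.match(r'\s*<([^>]*)>', value)
    if target is None:
        raise ValueError("no <target> found")
    params = []
    for m in PARAM.finditer(value, target.end()):
        name, quoted, token = m.groups()
        params.append((name, quoted if quoted is not None else token))
    return target.group(1), params

def serialize_known_only(value, known=("rel",)):
    """Keep only known parameters and report the names that had to be dropped."""
    href, params = parse_link(value)
    link = {"href": href}
    dropped = []
    for name, v in params:
        if name in known:
            link[name] = v
        else:
            dropped.append(name)
    return link, dropped

header = '</>; rel="http://example.net/foo"; this="x"; might="y"; be="z"; important="right?"'
link, dropped = serialize_known_only(header)
print(link)
print(dropped)
```

With only `rel` known, four of the five parameters end up in `dropped` and never reach the JSON consumer.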
Ok, so the problem at hand is that information is lost when converting from Link to JSON and back again. What I don't really understand is that this is something someone wants to do.
I can understand the creation of a JSON representation of the same abstract model as that of Link, but why will the two concrete serializations be used by a single processor? What is the use-case of converting from Link to JSON and back again?
On 2017-08-16 04:04, Asbjørn Ulsberg wrote:
I can understand the creation of a JSON representation of the same abstract model as that of |Link|, but why will the two concrete serializations be used by a single processor? What is the use-case of converting from |Link| to JSON and back again?
a rather typical use case we are looking at is that you have an HTTP-focused component such as a proxy, which captures and exposes HTTP information to JSON-focused components such as applications that want to analyze and work on HTTP traces and logs without the need to parse specific syntaxes.
if our representation is categorically unable to represent links without schema-awareness, we make it less useful. i have to admit that i am struggling a bit here to understand the downsides of a schema-agnostic representation (other than opportunities for schema-specific JSON optimization).
exposes HTTP information to JSON-focused components such as applications that want to analyze and work on HTTP traces and logs without the need to parse specific syntaxes.
Ok. This requires conversion from Link to JSON. But not back again from JSON to Link, am I correct?
On 2017-08-16 14:43, Asbjørn Ulsberg wrote:
Ok. This requires conversion from |Link| to JSON. But not back again from JSON to |Link|, am I correct?
true for this scenario. i don't have such a use case somewhere now, but i could easily imagine tooling that accepts JSON-structured links (which might be more convenient for programmers to manage) and then injects those into HTTP as link headers.
such tooling could even do this on a case-by-case basis as in @hvdsomp's scenarios: create link headers if there are not too many links, or add a "linkrel" link and provide an application/linkrel+json resource if there are too many links for inline representation in the header.
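A hedged sketch of what such tooling might look like, assuming links are held as JSON-style objects with an `href` and a `params` array of name/value objects. The function names, the `max_inline` threshold, and the `/links.json` URL are hypothetical, and parameter values are assumed not to contain double quotes.

```python
def format_link(link):
    """Render one JSON-style link as a native link value."""
    params = "".join(
        f'; {p["name"]}="{p["value"]}"' for p in link.get("params", [])
    )
    return f'<{link["href"]}>{params}'

def link_headers(links, linkset_url, max_inline=10):
    """Inline the links if there are few, else point to a separate link resource."""
    if len(links) <= max_inline:
        return [("Link", ", ".join(format_link(l) for l in links))]
    # Too many links: emit a single "linkrel" link to the JSON resource instead.
    return [("Link", f'<{linkset_url}>; rel="linkrel"')]

links = [
    {"href": "/a", "params": [{"name": "rel", "value": "item"}]},
    {"href": "/b", "params": [{"name": "rel", "value": "item"},
                              {"name": "x", "value": "123"}]},
]
inline = link_headers(links, "/links.json")
by_reference = link_headers(links, "/links.json", max_inline=1)
print(inline)
print(by_reference)
```

Note that `format_link` writes out every parameter it is given, known or not, which is only possible when the JSON side carried the unknown ones along.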
Yes, but that will only require conversion from JSON to Link, right? It's only the full roundtrip that is problematic, afaict, and I can't think of a good use-case for a full roundtrip of the two serializations.
On 2017-08-16 14:59, Asbjørn Ulsberg wrote:
Yes, but that will only require conversion from JSON to |Link|, right? It's only the full roundtrip that is problematic, afaict, and I can't think of a good use-case for a full roundtrip of the two serializations.
same here, for now. but if we have reasonable scenarios for both directions, that gives us a good starting point, or not? what would be changed if we had one scenario requiring a full roundtrip?
The lossless round trip is enabled by the serialization proposal I submitted earlier today. I am not sure what the problem is anymore.
suggested resolution for this issue:
as initially suggested, the representation does not require schema information and there is no requirement to ignore unknown parameters.
all parameters are therefore treated uniformly.
Agreed. As far as I am concerned this issue can be closed.
https://github.com/dret/I-D/issues/74#issuecomment-322427054 suggests requiring that unknown link parameters be ignored, and this seems to be important enough to discuss as a standalone issue. it seems like this would seriously impact the way links can be passed around by intermediaries. for example, an intermediary with the job of serializing HTTP Link headers would have to drop all link parameters that it does not know, instead of being able to serialize them. furthermore, only link parameters that have a defined JSON serialization could be serialized; all others by definition would have to be dropped. this seems like a rather drastic departure from the usual web model of passing information around, such as for example HTTP's requirement to pass on headers (instead of silently dropping them).