dret / I-D

Internet Drafts I've authored or contributed to.
Put the linkset array inside an object #103

Closed BigBlueHat closed 4 years ago

BigBlueHat commented 6 years ago

Using an array as your top-level container cuts off the opportunity to add information about all the items in the array: its provenance, extensions, etc. It greatly limits what's possible.

Consider reshaping:

   [ { "href"   : "http://example.com/foo",
       "anchor" : "http://example.net/bar",
       "rel"    : [ "next" ] },
     { "href"   : "http://example.com/foo",
       "anchor" : "http://example.net/bar",
       "rel"    : [ "http://example.net/linkrel" ] } ]

to:

{
  "linkset":
   [ { "href"   : "http://example.com/foo",
       "anchor" : "http://example.net/bar",
       "rel"    : [ "next" ] },
     { "href"   : "http://example.com/foo",
       "anchor" : "http://example.net/bar",
       "rel"    : [ "http://example.net/linkrel" ] } ]
}

This opens the door to far more extensibility and clarity than the current "direct array" format.
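
As a minimal sketch of the reshaping (Python is used only for illustration; the helper name `wrap_linkset` is hypothetical and not part of the draft), a producer could wrap the bare array like this:

```python
import json


def wrap_linkset(links):
    """Wrap a bare array of link objects in a top-level object.

    `links` is the "direct array" form; the result uses the proposed
    "linkset" member. The function name is illustrative only.
    """
    if not isinstance(links, list):
        raise TypeError("expected the bare-array form (a JSON array)")
    return {"linkset": links}


# The "direct array" form from the example above, shortened to one link.
bare = json.loads("""
[ { "href"   : "http://example.com/foo",
    "anchor" : "http://example.net/bar",
    "rel"    : [ "next" ] } ]
""")

wrapped = wrap_linkset(bare)
print(json.dumps(wrapped, indent=2))
```

The link objects themselves are untouched; only the container changes, which is what leaves room for sibling members next to "linkset".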

Cheers! 🎩

dret commented 6 years ago

On 2018-08-22 16:38, BigBlueHat wrote:

Using an array as your top-level container cuts off the opportunity to add information about /all/ the items in the array: its provenance, extensions, etc. It greatly limits what's possible. This opens the door to far more extensibility and clarity than the current "direct array" format.

thanks for the suggestion! i think i like that idea. @hvdsomp, what are your thoughts?

csarven commented 5 years ago

Just jumping in to agree that @BigBlueHat's proposal is a bit more useful: easier to create and manipulate.

hvdsomp commented 5 years ago

Yes, I am definitely in support of this too.

dret commented 5 years ago

i agree in spirit, but i am hesitant about defining or even allowing other members in the top-level object. i do see use cases where this could be useful, but this would also make the two serializations not aligned anymore, as the native model does not have such a "top-level container". the problem is that if applications add top-level metadata that change the semantics of the linkset, then it becomes implementation-dependent what the linkset represents. so we should mandate that any additional info in the top-level element cannot change the semantics of the linkset.

BigBlueHat commented 5 years ago

the native model does not have such a "top-level container".

Don't Link and <link> implicitly rely on their "top-level container"s? The surrounding context for much of their data and value typically comes from the surrounding HTTP message or HTML document: dictated by things like Request-URI or <base>.

Consequently, this linkset document has very little value currently because its contextual use isn't knowable:

   [ { "href"   : "/foo",
       "anchor" : "/bar",
       "rel"    : [ "next" ] } ]

According to the current linkset draft, that should be valid, but it relies on "baseURI"...which can't be expressed in the current format (afaict):

  The value of both the "href" and "anchor" members MUST be a URI-
  reference as defined in Section 4.1 of [RFC3986].  Note that for
  relative references, the baseURI is the URI of the link set
  resource.

So, if I wanted to state the baseURI of the document I'd extracted that linkset from where would I put it? I'd either have to absolutize all the URIs (which ultimately means there's a new requirement for linksets to only contain absolute URIs), or I need a place to encode it for the whole document...which is where the top-level object would come in real handy. 😁
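
A rough sketch of the "absolutize all the URIs" option, assuming the base URI is known from somewhere outside the document. The function name `absolutize` and the base URI value are hypothetical; resolution uses the standard library's `urllib.parse.urljoin`:

```python
from urllib.parse import urljoin


def absolutize(links, base_uri):
    """Return copies of the link objects with "href" and "anchor" resolved.

    `base_uri` has to be supplied by the caller, because the bare-array
    format has no place to carry it.
    """
    resolved = []
    for link in links:
        copy = dict(link)
        for member in ("href", "anchor"):
            if member in copy:
                copy[member] = urljoin(base_uri, copy[member])
        resolved.append(copy)
    return resolved


links = [{"href": "/foo", "anchor": "/bar", "rel": ["next"]}]
base = "http://example.com/some/page"  # assumed; not recoverable from `links`

absolute = absolutize(links, base)
print(absolute)
```

Without `base`, the call cannot be made at all, which is the gap described above.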

FWIW, JSON-LD already has such a handy way to provide this specifically, so you could express the above as:

{
  "@context": {
    "@base": "http://example.com/page-i-pulled-links-from"
  },
  "linkset":
   [ { "href"   : "/foo",
       "anchor" : "/bar",
       "rel"    : [ "next" ] } ]
}

Also, an extensibility provision could explain what should and should not be changed about the required terms and their usage. Since you have a dedicated media type for these documents, you could further enforce that processing and any extensibility requirements against the defined processing steps.

Anyhow. Extensible top-level objects are a Good Thing for all. 😸

dret commented 5 years ago

Don't |Link| and |<link>| implicitly rely on their "top-level container"s? The surrounding context for much of their data and value typically comes from the surrounding HTTP message or HTML document: dictated by things like |Request-URI| or |<base>|.

yes of course, which is why relative URIs make sense for them. they are by definition part of a bigger construct that is establishing their context.

Consequently, this linkset document has very little value currently because its contextual use isn't knowable:

   [ { "href"   : "/foo",
       "anchor" : "/bar",
       "rel"    : [ "next" ] } ]

it is valuable if you know its context (or at least its base URI). but yes, if you take the context away, there's no way to resolve the URI.

According to the current linkset draft https://tools.ietf.org/html/draft-wilde-linkset-03#section-4.2.2 that should be valid, but it relies on "baseURI"...which can't be expressed in the current format (afaict):

|The value of both the "href" and "anchor" members MUST be a URI- reference as defined in Section 4.1 of [RFC3986]. Note that for relative references, the baseURI is the URI of the link set resource. |

i'd argue that the main goal of linksets is to be truthful representations of RFC 8288. since RFC 8288 starts from the assumption that there is a context, linksets inherit that assumption. if you want to use them, you have to reflect that assumption, and that may mean that when managing them, you might have to manage their context (the base URI) with them.

So, if I wanted to state the |baseURI| of the document I'd extracted that linkset from where would I put it? I'd either have to absolutize all the URIs (which ultimately means there's a new requirement for linksets to only contain absolute URIs), or I need a place to encode it for the whole document...which is where the top-level object would come in real handy. 😁

i think i understand your point. personally, my preference would be to stick to just representing what RFC 8288 represents. that means that you might have to manage the linkset's context externally.

i'd argue that's not something new here. for example, if you build HTTP logging that logs individual header fields, you have to do the same: for Link header fields, you'd have to keep its context info around, or you cannot interpret relative URIs. in that case, too, you probably wouldn't change the link header field and somehow stick the base URI in there. more likely you'd store an association between the header field and its context, which then would allow you to resolve URIs, if needed.
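
a minimal sketch of that kind of association, in Python. the class name `LoggedLink` and its fields are hypothetical, and it assumes the logged context URI also serves as the base URI for resolving the link target:

```python
from dataclasses import dataclass
from urllib.parse import urljoin


@dataclass(frozen=True)
class LoggedLink:
    """One logged link, stored unchanged next to the context it came from."""

    raw_target: str   # URI-reference exactly as it appeared in the header field
    context_uri: str  # URI of the resource the response was about

    def resolved_target(self) -> str:
        # Resolution happens on demand; the stored value is never rewritten.
        return urljoin(self.context_uri, self.raw_target)


entry = LoggedLink(raw_target="/foo", context_uri="http://example.com/bar/baz")
print(entry.raw_target, "->", entry.resolved_target())
```

the header value stays in its native form, and the base URI lives beside it rather than inside it.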

Anyhow. Extensible top-level objects are a Good Thing for all. 😸

yes, but only if they don't change the semantics of what's in there. robust extensibility needs "must ignore" processing rules.
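
a sketch of what "must ignore" could look like for a consumer, in Python. the function name `read_linkset`, the set `KNOWN_MEMBERS`, and the extension member "x-provenance" are all hypothetical; only the "linkset" member name comes from this thread:

```python
import json

# Assumption: "linkset" is the only top-level member the format defines.
KNOWN_MEMBERS = {"linkset"}


def read_linkset(document):
    """Parse a linkset document and return (links, ignored_member_names).

    Unknown top-level members are skipped and never influence the
    returned links, so they cannot change the linkset's semantics.
    """
    data = json.loads(document)
    if not isinstance(data, dict) or "linkset" not in data:
        raise ValueError('expected a top-level object with a "linkset" member')
    ignored = sorted(set(data) - KNOWN_MEMBERS)
    return data["linkset"], ignored


plain = '{"linkset": [{"href": "http://example.com/foo", "rel": ["next"]}]}'
extended = (
    '{"x-provenance": "crawl-42",'
    ' "linkset": [{"href": "http://example.com/foo", "rel": ["next"]}]}'
)

links_plain, _ = read_linkset(plain)
links_extended, ignored = read_linkset(extended)
print(links_plain == links_extended, ignored)
```

both documents yield the same links; the extra member is reported but has no effect on them.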

BigBlueHat commented 5 years ago

Link and <link> have no value outside of their contexts (afaict), so building a JSON format which encodes only their usage, but not their context seems insufficient. If the objective is to encode all of RFC 8288, then having the option (at least) to encode the contextual data (baseURI, etc), seems invaluable to this specification's purpose.

The JSON format seems to exist precisely to express things "out of context" (and link to that context via some other link with rel="linkset"), so if that JSON format does not encode the necessary contextual data, won't that pragmatically make the JSON format's expression of the linkset model restricted to absolute URLs and ones that always include anchor? Otherwise, if a request were made for the linkset resource (or if that resource were just on disk somewhere) it would lose all its meaning and therefore value...would it not?

dret commented 5 years ago

On 2019-03-12 17:01, BigBlueHat wrote:

|Link| and |<link>| have no value outside of their contexts (afaict), so building a JSON format which encodes only their usage, but not their context seems insufficient. If the objective is to encode all of RFC 8288, then having the option (at least) to encode the contextual data (baseURI, etc), seems invaluable to this specification's purpose.

your preference is noted.

The JSON format seems to exist precisely to express things "out of context" (and link to that context via some other link with |rel="linkset"|), so if that JSON format does not encode the necessary contextual data, won't that pragmatically make the JSON format's expression of the linkset model restricted to absolute URLs and ones that always include |anchor|? Otherwise, if a request were made for the linkset resource (or if that resource were just on disk somewhere) it would lose all its meaning and therefore value...would it not?

that depends. if you send around linksets without their context, then yes. if you either provide the context somehow, or rewrite linksets when an application thinks that's useful, then no.

i think we have established a good understanding of what your preference is and why. @hvdsomp, do you have an opinion on this matter?

hvdsomp commented 5 years ago

Note that in the most recent draft of the I-D, here in GitHub, a requirement was introduced to make all URIs absolute. That was basically an executive decision (by me), lacking input on #117. Obviously, the decision/proposal is open for discussion.

dret commented 5 years ago

On 2019-03-13 11:13, Herbert Van de Sompel wrote:

Note that in the most recent draft of the I-D, here in GitHub, a requirement was introduced to make all URIs absolute. That was basically an executive decision (by me), lacking input on #117 https://github.com/dret/I-D/issues/117. Obviously, the decision/proposal is open for discussion.

would it be fair to say that @bigbluehat's proposal is just another way of achieving the goal of #117? if so, i think i lean towards @bigbluehat's, since it doesn't require users to rewrite the whole linkset, and to change it from its native representation.

should we open a separate issue for that proposal? i think we should, this issue (#103) is not really about URIs and their resolution.

hvdsomp commented 5 years ago

Could this just be done under #117? The problem statement over there lists expressing a baseURL as an option. Note that this option will not work for the application/linkset format. So, I am not sure I agree it is the better approach.

dret commented 5 years ago

On 2019-03-13 14:20, Herbert Van de Sompel wrote:

Could this just be done under #117 https://github.com/dret/I-D/issues/117? The problem statement over there lists expressing a baseURL as an option. Note that this option will not work for the application/linkset format. So, I am not sure I agree it is the better approach.

true. all depends on how much you value roundtrip-ability against the ability to be context-free when possible. let's move over to #117.

dret commented 4 years ago

we are now using an object as the top-level construct.