json-ld / json-ld.org

JSON for Linked Data's documentation and playground site
https://json-ld.org/

JSON-LD 1.1 Feature Request : define how to specify the json-ld profile in a request to a server and include framing as an option #491

Closed lisp closed 6 years ago

lisp commented 7 years ago

the current descriptions provide means to specify encoding and framing options through a programmatic api. these include the context to apply and whether to frame the result. they also provide means in http headers to specify the variant which applies to the given document in terms of a link to a context and/or a media type profile.

the 1.0 descriptions do not appear to provide a way to specify that same information to a json-ld source as part of an http request. is this now included somewhere in the 1.1 documents? are there plans to provide for this?

gkellogg commented 7 years ago

You can see the current Editor's Drafts of the specs here (note, RDF API is archaic).

The IANA Considerations describe values for a profile parameter, but do not provide a mechanism to provide a context to use for compacting. There is a mechanism for providing a context to use when expanding, but only if the content type is not application/ld+json (see Interpreting JSON as JSON-LD), and that is for a reply, not a GET.

Framing was not part of 1.0, so was not included as a possible profile; expect it to be included for 1.1.

I don't recall any discussions we may have had about providing such information for retrieving JSON-LD documents, but there may be something buried in the closed issues. Is this what you're referring to?

lisp commented 7 years ago

I don't recall any discussions we may have had about providing such information for retrieving JSON-LD documents, but there may be something buried in the closed issues. Is this what you're referring to?

yes, the issue is, how to specify in an http request to a sparql endpoint everything which is required in order to effect and control framing. i found no mention in any issue - open or closed.

as an interim measure, we have been using the same mechanism as served to report the state of the response, but that has no actual basis in the recommendations.

gkellogg commented 7 years ago

It seems to me that there are three ways in which a request might include the location of a context to use for compacting or frame to use for framing:

Of these, I think an informative description of doing this in a query parameter is probably safest, and allows it to be specified using CURL or WGET more easily. But, this is an area the spec cannot normatively address.

The second possibility seems consistent with the use of a profile parameter, but I recall that the discussions around this were complicated, and it may be more challenging than we think. Also, considering that a service actually needs to dereference the URIs and use them in server-side processing, there are greater security implications. Perhaps @msporny @dlongley @lanthaler or @niklasl have better recollection about these discussions for 1.0.

The last option of using a Link relation seems inconsistent with normal REST principles, so I would not recommend that.

@lisp please share your experience doing this with Dydra, the specifics of your "interim measure" and what your experience with that has been.

lisp commented 7 years ago

the last option, the link header, is what we actually implemented, but upon consultation with the customer who uses this the most, i was informed that, absent a "standard" mechanism, they reframe everything in the client. that is, we frame it once - without a context and they frame it again to effect the name substitutions, on their end. seems very wasteful.

we did not put it in the url as a query argument, as the context is not a semantic component of the addressed resource. we did not add a media type parameter, but would not be averse to that approach, as the context is an elaboration of the media type designation.

Conal-Tuohy commented 7 years ago

@lisp do you allow clients to request any frame at all? i.e. are the acceptable values of the header constrained, e.g. with a "same domain" policy?

lisp commented 7 years ago

we see that as on the same order as a service location: if the host requires authorization, it's just another external resource.

On 27.05.2017 at 6:20 a.m., "Conal Tuohy" notifications@github.com wrote:

@lisp do you allow clients to request any context at all? i.e. are the acceptable values of the header constrained, e.g. with a "same domain" policy?

Conal-Tuohy commented 7 years ago

My question is really this: if I were to send a request to your JSON-LD server asking for a response to be framed using a resource http://example.com/some-arbitrary-frame-which-might-be-multiple-gigabytes-in-size, would your current implementation attempt to use that frame, or would you (for instance) reject it because it was not in a white-list of acceptable frames?

lisp commented 7 years ago

indirectly, yes, it could be rejected, but not because there is a white list of frames. a service which enforces access authorization can signal an error if the respective repository's account does not have read access for the remote resource. that is, in the sense of the question, resources are "white-listed".

whether the size matters is another issue. unless we arrive at a point where particular aspects of query processing are metered, we could limit the size of a context, but that would not really matter. if the context were "too large", the respective query processor would exhaust its memory and exit. this is no different than the consequences of a service request response of excessive size or a cross-join of a single-subject repository. that request fails and the next proceeds with a new process.

mongoponzo commented 7 years ago

if I were to send a request to your JSON-LD server asking for a response to be framed using a resource http://example.com/some-arbitrary-frame-which-might-be-multiple-gigabytes-in-size

My understanding is that you do not pass a URL to the frame, but instead you pass the frame itself. So this issue should already be handled by the webserver (or any proxy before it).

lisp commented 7 years ago

the question does not concern the programmatic api. it relates to a json-ld stage which is integrated into a sparql processor, which means the question is exactly how that sparql server handles the protocol-level issue.

Conal-Tuohy commented 7 years ago

Yes the HTTP interface is exactly what I'm interested in; whether that's for a SPARQL query processor or indeed for any kind of "linked data" server, such as an LDP server. I am curious about current implementations of this (currently unspecified) part of the protocol, which is really a particular kind of HTTP Content Negotiation. It had struck me that allowing a client to specify a frame by URI reference would present a new DoS attack surface, so my curiosity was about whether implementers had made any attempts to mitigate that risk.

NB SPARQL query processors will often provide an analogous system for XML-based serializations of RDF, in which they will accept a URI parameter from the client which refers to an XSLT stylesheet to be used for re-writing the result (and which is roughly equivalent to JSON-LD framing), but such XSLT processing is always done client-side; the server simply returns a result with an embedded XML processing instruction which refers to the XSLT. Doing the XSLT processing server side would impose a DoS risk on the server.

lisp commented 7 years ago

so, i tried to make it clear in my answer that each and every sparql request is in and of itself a dos vector. does the context url add something new to that risk? nb. we support also server-side xslt processing.

Conal-Tuohy commented 7 years ago

Thanks @lisp -- I think the context URL issue is more significant for protocols other than SPARQL Query. I take your point that the additional costs of allowing a client to specify a frame for a SPARQL query response is small compared to the costs of the query itself, including embedded service requests. Anyone who operates a SPARQL query server will already have to deal with DoS issues, so they'll be able to deal with the potential abuse of frames in the same way; quotas, timeouts, etc.

I understand that you do provide a mechanism (via access control lists) which could be used to administratively prohibit a user or class of user from using an external frame, and I'm guessing you can also constrain server-side XSLT processing similarly.

However, what I'm wondering about is other implementations of "HTTP framing negotiation" (for want of a better name), outside the scope of the SPARQL query protocol, e.g. in the SPARQL GSHP, LDP, Triple Pattern Fragments, etc, and especially at scale in a totally open environment ("Linked Open Data"). I'm wondering how wary I should be about the potential for abuse if it were implemented with those protocols.
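
For the quotas and timeouts mentioned above, one possible mitigation is to cap how many bytes a server will read when it dereferences a client-supplied frame or context. The following is only a minimal sketch under stated assumptions: `MAX_FRAME_BYTES`, `FrameTooLarge` and `read_capped` are hypothetical names, and no implementation discussed in this thread is known to work this way.

```python
import io

# Hypothetical limit; a real deployment would tune this value.
MAX_FRAME_BYTES = 64 * 1024


class FrameTooLarge(Exception):
    """Raised when a remote frame or context exceeds the size cap."""


def read_capped(stream, max_bytes=MAX_FRAME_BYTES, chunk_size=8192):
    """Read a binary stream to the end, aborting once more than max_bytes arrive."""
    chunks = []
    total = 0
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:
            break
        total += len(chunk)
        if total > max_bytes:
            # Stop early instead of buffering an arbitrarily large document.
            raise FrameTooLarge(f"frame exceeds {max_bytes} bytes")
        chunks.append(chunk)
    return b"".join(chunks)


# Usage with an in-memory stream standing in for an HTTP response body.
small_frame = read_capped(io.BytesIO(b'{"@type": "Person"}'))
```

In practice the stream could be the response object returned by `urllib.request.urlopen(url, timeout=5)`, so that the timeout bounds slow responses while the byte cap bounds large ones.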

azaroth42 commented 7 years ago

As background, this seems very similar in nature to GraphQL functionality.

Given that the profile can be any URI, couldn't you just check for a profile of a known frame?

gkellogg commented 7 years ago

As a potential attack vector, I don't think specifying a frame via an HTTP header creates any different exposure to providing a context via a Link header, or as embedded within an HTML document.

It's reasonable for implementations to whitelist such frame IRIs, IMO.
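
As an illustration of such whitelisting, a server could accept a frame IRI only if it is explicitly listed or comes from a trusted origin (the "same domain" policy raised earlier in the thread). This is a sketch, not a description of any existing implementation; `ALLOWED_FRAME_IRIS`, `ALLOWED_ORIGINS` and `frame_is_acceptable` are hypothetical names and the example.com values are placeholders.

```python
from urllib.parse import urlsplit

# Hypothetical policy data; both sets are placeholders.
ALLOWED_FRAME_IRIS = frozenset({"https://example.com/frame.jsonld"})
ALLOWED_ORIGINS = frozenset({("https", "example.com")})


def frame_is_acceptable(iri):
    """Return True if the frame IRI is listed or served from a trusted origin."""
    if iri in ALLOWED_FRAME_IRIS:
        return True
    parts = urlsplit(iri)
    # Compare the full netloc (including any userinfo or port) so that
    # look-alike authorities such as "example.com@evil.example" are rejected.
    return (parts.scheme, parts.netloc.lower()) in ALLOWED_ORIGINS
```

A request naming a frame that fails this check could then be rejected before the server attempts to dereference it.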

akuckartz commented 7 years ago

Maybe @rubenverborgh has something to say regarding this issue?

RubenVerborgh commented 7 years ago

Yes, we're working on profile-based negotiation right here: http://profilenegotiation.github.io/I-D-Accept--Schema/I-D-accept-schema.txt (source: https://github.com/ProfileNegotiation/I-D-Accept--Schema/)

Would be very good to have this aligned.

jaw111 commented 7 years ago

I can imagine that the ability to send a context/frame as the HTTP request payload would be handy in some cases (like during development), but does not seem like the way to go for production use. Personally I would much prefer the ability to provide the name (URI) of the context/frame in HTTP request headers. Ideally the server would cache any context/frame (provided there is a robust cache invalidation mechanism) and would try to fetch any unknown context/frame dynamically.

@RubenVerborgh in the draft you use angle-brackets <> to delimit the URIs, whereas previously I had seen double-quotes " used.

Accept: application/ld+json; profile=<http://www.w3.org/ns/json-ld#framed>

vs.:

Accept: application/ld+json; profile="http://www.w3.org/ns/json-ld#framed"

Is the difference intentional?

RubenVerborgh commented 7 years ago

~I see angular brackets here~ I was wrong: https://www.w3.org/TR/activitystreams-core/#media-type

jaw111 commented 7 years ago

Hmmm, mixed messages: https://www.w3.org/TR/json-ld/#iana-considerations

Please note that, according HTTP11, the value of the profile parameter has to be enclosed in quotes (") because it contains special characters and, if multiple profiles are combined, whitespace.

From RFC2616:

Accept                  = "Accept" ":"
                          #( media-range [ accept-params ] )

media-range             = ( "*/*"
                          | ( type "/" "*" )
                          | ( type "/" subtype )
                          ) *( ";" parameter )
parameter               = attribute "=" value
attribute               = token
value                   = token | quoted-string
quoted-string           = ( <"> *(qdtext | quoted-pair ) <"> )

RubenVerborgh commented 7 years ago

I misread, should indeed be quotes!
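
A client can build the quoted form mechanically. The sketch below assumes the `quoted-string` rule quoted above from RFC 2616 and escapes backslashes and double quotes before wrapping the value; `quote_param` and `accept_with_profile` are hypothetical helper names, not part of any JSON-LD library.

```python
def quote_param(value):
    """Return value as an HTTP quoted-string, escaping backslash and double quote."""
    escaped = value.replace("\\", "\\\\").replace('"', '\\"')
    return f'"{escaped}"'


def accept_with_profile(*profile_iris, media_type="application/ld+json"):
    """Build an Accept header value whose profile parameter lists the given IRIs.

    Multiple profile IRIs are joined with a space, which is one reason the
    parameter value has to be quoted.
    """
    profile = " ".join(profile_iris)
    return f"{media_type}; profile={quote_param(profile)}"


header_value = accept_with_profile("http://www.w3.org/ns/json-ld#framed")
```

The resulting value can be sent as the `Accept` header of a request made with any HTTP client.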

gkellogg commented 7 years ago

A PR that addresses the profile request including external frame or context would be appreciated.

gkellogg commented 7 years ago

@RubenVerborgh My takeaway from looking at Negotiating Profiles in HTTP is that JSON-LD could use this in addition to or as an alternative to the profile mime-type parameter. For example:

GET /compacted-document.jsonld HTTP/1.1
Host: example.com
Accept: application/ld+json
Accept-Profile: <http://www.w3.org/ns/json-ld#compacted https://example.com/context.jsonld>

====================================

HTTP/1.1 200 OK
...
Content-Type: application/ld+json
Profile: <http://www.w3.org/ns/json-ld#compacted https://example.com/context.jsonld>

{
  "@context": "https://example.com/context.jsonld",
  "name": "Markus Lanthaler",
  "homepage": "http://www.markus-lanthaler.com/",
  "image": "http://twitter.com/account/profile_image/markuslanthaler"
}

This could also be used with the profile <http://www.w3.org/ns/json-ld#framed https://example.com/frame.jsonld> to request a particular frame. A server may choose to reject this if the frame does not meet some criteria, such as white-listing.

Is that consistent with your recommendations? Your proposal would seem to allow either singleton profile URIs or pairs, such as used above. How would one distinguish between two different profiles, and a single profile pair?

In discussion with @ericprud, it seems that there might be overlapping uses for Profile, which might allow negotiating for a vocabulary variant (say FOAF vs. schema.org), in addition to the frame or context. This seems to be anticipated in Negotiating Profiles in HTTP.
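
To make the singleton-versus-pair question concrete, here is a sketch of how a server might read the `Accept-Profile` value in the shape used in the example above. It assumes exactly that shape (one or two whitespace-separated IRIs inside each pair of angle brackets), which is a proposal in this comment rather than settled syntax; `parse_profile_header` is a hypothetical name.

```python
import re


def parse_profile_header(value):
    """Return a list of (profile_iri, resource_iri_or_None) pairs.

    Each <...> group is treated as one entry: a lone IRI is a singleton
    profile, and two IRIs are read as a profile plus the context or frame
    it should be applied with. This reading is an assumption.
    """
    pairs = []
    for group in re.findall(r"<([^<>]*)>", value):
        parts = group.split()
        if not parts or len(parts) > 2:
            raise ValueError(f"unsupported profile entry: <{group}>")
        profile = parts[0]
        resource = parts[1] if len(parts) == 2 else None
        pairs.append((profile, resource))
    return pairs


requested = parse_profile_header(
    "<http://www.w3.org/ns/json-ld#framed https://example.com/frame.jsonld>"
)
```

Under this reading, two separate profiles would be written as two bracketed groups, while a profile pair shares one group; whether the eventual syntax draws the line this way is not decided here.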

RubenVerborgh commented 7 years ago

My takeaway from looking at Negotiating Profiles in HTTP is that JSON-LD could use this in addition to or as an alternative to the profile mime-type parameter.

Indeed, that would be great!

Is that consistent with your recommendations?

Roughly, yes. The syntax will probably be slightly different.

Your proposal would seem to allow either singleton profile URIs or pairs, such as used above. How would one distinguish between two different profiles, and a single profile pair?

That's still an open question, but something we plan to address indeed. They will be treated as different cases.

it seems that there might be overlapping uses for Profile, which might allow negotiating for a vocabulary variant (say FOAF vs. schema.org), in addition to the frame or context.

Indeed, and not all MIME types need to support this.

gkellogg commented 6 years ago

Deferred to WG due to https://json-ld.org/minutes/2018-04-10/#resolution-3.

gkellogg commented 6 years ago

Closed in favor of https://github.com/w3c/json-ld-syntax/issues/8.