Closed bblfish closed 8 years ago
In response to @gobengo's remark in issue 9, here is a reason why just using URLs as attribute names does not get us very far. Attribute values in a string are not good enough to reconstruct the data structure.
Here is an example: imagine you have a form asking you for your name, your age, and a number of address fields, where the list of attribute values would end up being for example:
street=19 rue Saint Honore
city=Fontainebleau
country=France
zip=77300
name=Henry
age=47
to make it simple. ( If you want pingback to be extensible you could imagine pinging such a piece of information ). Having the attributes use the foaf or card ontology urls would not help the server reconstruct the data structure behind it which would be something like
@prefix foaf: <http://xmlns.com/foaf/0.1/> .
@prefix contact: <http://www.w3.org/2000/10/swap/pim/contact#> .
[] foaf:name "Henry";
   foaf:age 47;
   contact:home
     [ a contact:ContactLocation;
       contact:address [ contact:city "Fontainebleau";
                         contact:country "France";
                         contact:postalCode "77300";
                         contact:street "19 rue Saint Honore" ]
     ] .
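To make the loss of structure concrete, here is a minimal Python sketch (standard library only) of what a server sees after decoding such a form body. The variable names are illustrative and not part of any proposal in this thread.

```python
from urllib.parse import urlencode, parse_qsl

# The form fields from the example above, in submission order.
form = [
    ("street", "19 rue Saint Honore"),
    ("city", "Fontainebleau"),
    ("country", "France"),
    ("zip", "77300"),
    ("name", "Henry"),
    ("age", "47"),
]

# What goes over the wire as application/x-www-form-urlencoded.
body = urlencode(form)

# What the receiver gets back: a flat list of (name, value) string pairs.
pairs = parse_qsl(body)

# Nothing in the pairs says which fields belong to the address and which
# to the person, and every value, including the age, is a plain string.
all_strings = all(isinstance(value, str) for _, value in pairs)
```

The decoded pairs are identical to the input list: no nesting and no datatypes survive the encoding.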
There is no way to know, without extra context information, how to go from such attribute-value pairs to the more complex structure above. What is needed is for a form with attribute-value pairs to give a client a way to work out what graph of information those pairs would result in. We could use the same mechanism as the one described above. The client could make a request to the /address resource
GET /address HTTP/1.0
but this time, instead of a "urlencoded" link to the WebMention document, the response could carry the header
200 Ok
Link: <http://postal.org/Address>; rel="urlencoded"
where someone would have published at http://postal.org/Address a document that would allow the full meaning to be deduced:
@prefix foaf: <http://xmlns.com/foaf/0.1/> .
@prefix contact: <http://www.w3.org/2000/10/swap/pim/contact#> .
CONSTRUCT {
  [] foaf:name ?name;
     foaf:age ?age;
     contact:home
       [ a contact:ContactLocation;
         contact:address [ contact:city ?city;
                           contact:country ?country;
                           contact:postalCode ?zip;
                           contact:street ?street ]
       ] .
} WITH ?name ?age ?city ?country ?zip ?street
We also need a way to specify that the age is an integer, so an extra piece of datatype conversion syntax is required.
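As a rough illustration of what such a published mapping buys a receiver, here is a Python sketch that applies a template to the flat form values, including the integer conversion for the age. The MAPPING structure and the apply_mapping function are hypothetical stand-ins for the CONSTRUCT-style document sketched above, not a proposed syntax, and the result is a nested dictionary rather than a real RDF graph.

```python
from urllib.parse import parse_qsl

# Hypothetical template: a tuple leaf is (form field name, converter),
# a dict is a nested (blank) node, and a bare string is a constant.
MAPPING = {
    "foaf:name": ("name", str),
    "foaf:age": ("age", int),  # the datatype conversion step
    "contact:home": {
        "a": "contact:ContactLocation",
        "contact:address": {
            "contact:city": ("city", str),
            "contact:country": ("country", str),
            "contact:postalCode": ("zip", str),
            "contact:street": ("street", str),
        },
    },
}


def apply_mapping(template, values):
    """Fill the template with form values, converting datatypes on the way."""
    if isinstance(template, dict):
        return {key: apply_mapping(sub, values) for key, sub in template.items()}
    if isinstance(template, tuple):
        field, convert = template
        return convert(values[field])
    return template  # constant such as the node's type


body = "street=19+rue+Saint+Honore&city=Fontainebleau&country=France&zip=77300&name=Henry&age=47"
graph = apply_mapping(MAPPING, dict(parse_qsl(body)))
```

With the mapping in hand, the same six flat fields yield the nested person/home/address structure, and the age arrives as an integer rather than a string.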
@melvincarvalho Replying re: subject in https://github.com/w3c-social/webmention/issues/9#issuecomment-159966620 Think of posting a webmention like writing to an LDP container - when you post the subject hasn't been created yet, but the server then creates a URI for it and returns that to you. That's already in the spec. So the subject doesn't ever need to be a blank node, but the URI is generated by the receiver, not the sender.
Re: all of the above about posting things other than source and target...
There is nothing in the URLEncoding specification that says you are meant to think of things the way you want to interpret them, @rhiaro. The example I posted above with the address is just meant to show that most forms create data structures that are much more complex than the one envisaged by the initial webmention protocol, so that the issue of the subject is just one small issue among many that need to be considered. How is the server meant to know which fields go together? Which fields are properties of an address and which are properties of the person? In the above case there are blank nodes between the person and the address, because houses can have more than one address (I was at an apartment in Paris in which that was the case). Vice versa: how is the client meant to know that the server receiving the properties is going to interpret them the way the client wishes them to be interpreted?
For that to work you need the server to provide a mapping from a form to an explicit semantics of how these property values will be interpreted. Anything else is quite literally wishful thinking.
That is what this issue is considering.
As I remember in Paris F2F the IndieWeb folks had developed protocols with much more complex Forms than the webmention one discussed here. If we are to enable all of those, and many more that will follow, we need a general way to deal with all of them, or else we'll be inventing ad hoc interpretations for each different service.
Trying to follow all the threads going on from email notifications.
Perhaps the spec should make it clear that NO DATA other than the addresses of locations where the data can be fetched should be provided unless some authentication is provided (leaving the method of auth up to the implementation). At that point you can include a single field for the encoding and a single field for a serialized form of the data.
Basically Source=https://example.com/somesource Target=https://example.net/target Encoding=activity+json Data={@context: ......}
Sorry for the abbreviated format. On my phone.
So the form encoding would not carry any data other than URLs, the encoding value, and then the serialized data, which can handle its own definition of what values mean.
@dissolve that does not answer the question as to how the client knows that the server will actually interpret what you send this way, or how a client that lands on such a resource could find out what to send it. Furthermore your answer seems again very ad-hoc.
Also, why not use a web browser to follow the discussions. It's a lot easier :-)
Re: "how the client knows that the server will actually interpret what you send this way" - the client just discovered the endpoint to post to. The server, in pointing to a webmention endpoint, is saying 'you can post source and target here according to the webmention spec'. Right..?
Re: "how a client that lands on such a resource could find out what to send it" - hmm, definitely an interesting problem, but I think perhaps out of scope. The spec is defining what to do for a client who is specifically looking to send a webmention (in which case they follow the endpoint discovery steps), not a client who is randomly crawling your site looking for places to post to. Whilst it would be cool if we could 'just' hypermedia all the things, I think this adds a level of complication that might hinder adoption for something that could otherwise be easy for people to pick up and implement on a whim.
@rhiaro you don't take into consideration
source and target seem to be very meaningful in a military context, just as much as in the webmention one)
@bblfish Assume I'm a bit dense, and walk me through 1. a) I link to a site which unbeknownst to me has a malicious, fake webmention endpoint, and discover this, and post to it. b) I have posted to a form 2 URLs, which it accepts and processes in order to misuse. c) ..? Given that a malicious script can go and crawl for pages which link to other pages, if that's all it needs to do malicious things, I'm not sure what extra problems are being caused.
2 - Sure, but this spec deals with the instance where someone did reach my resource (webmention endpoint) through this specific route. One could additionally extend one's webmention endpoint with hypermedia controls (handwaving hypermedia terminology here, sorry if I screw that up) but that's not part of this spec.
@rhiaro
In any case I have put forward a simple proposal that would allow this case to be solved and an infinite number of others too in a way that is
@rhiaro
Think of posting a webmention like writing to an LDP container - when you post the subject hasn't been created yet, but the server then creates a URI for it and returns that to you. That's already in the spec. So the subject doesn't ever need to be a blank node, but the URI is generated by the receiver, not the sender.
@bblfish
There is nothing that specifies in the URLEncoding specification that you are meant to think of things the way you want to interpret them @rhiaro .
There's more to the request than just the request body and content-type. The HTTP POST method is what implies what the body is for.
In any case I have put forward a simple proposal
I can't really understand what that is, but looks like it involves a 'urlencoded' link relation that I can't find any mention of anywhere else.
is extensible to more complex versions of pingback protocol
I sense an aversion from others to making a more complex version of the pingback protocol, and a wish to keep the webmention 'core' as simple as the current draft is. And I think I agree after the discussion in #3. More complex things (other properties, other mimetypes) can always be extensions once proven out.
provides a seamless path to make both IndieWeb folks and LDP people to work together the way they like with minimum disturbance.
I think we can do this just by recommending semantics for source/target as described in #9
@gobengo POST does make it explicit that you are creating or altering a resource, but that does not provide enough information to tell you what the content of the message is, when combined with the application/x-www-form-urlencoded mime type. It is the mime type that usually helps make the interpretation of the content explicit, but urlencoding does not provide enough of it - in the non human readable web that is. (In the html document web the form is produced by the same origin as the "endpoint" and the context is interpreted by very context sensitive agents called humans.)
@sandhawke asked the following in issue 10
@bblfish how can a header possibly help, since the agent doing the POST won't see the header until after it's completed the POST.
As this is more relevant to this issue I'll post the answer here:
This does indeed require an initial request on the endpoint/container.
HEAD first, and then a POST
POST to a different origin, there is in any case the cost of the initial CORS preflight request.
HEAD or GET is not necessary. The agent just takes an extra little risk by doing so.
GET or HEAD could be amortized over a long period, as the response can be cached
@sandhawke nobody in the LDP group was pushing application/x-www-form-urlencoded communication, so this discussion certainly would not have come up. The IndieWeb group on the other hand are putting forward a number of protocols to use application/x-www-form-urlencoded. A number of them were put forward at the Paris Face to Face, all based on their WebMention experience. If there is to be interaction between the LDP group and such protocols something like this is needed. Different forum, different problems.
IWC's work on micropub for instance is essentially RDF/POST. The fundamental difference is that IWC places its bet on the microformats vocabulary and hardcodes those terms as the parameter names. That particular approach is unfortunately bound to fail on longevity. Visible lesson: mf1->mf2. Lifetime of mf1 terms: ~5 years. It promotes regrettable HTML markup like for instance <p class="p-name entry-title e-content entry-content article"> which some prefer, at the cost of maintaining code bloat.
The IndieWebCamp has not yet taken the namespace issue into consideration, so it's not re-inventing RDF/POST. It is unlikely that #9 or RDF/POST will be acceptable to IndieWebCamp given that it is not easy for their developers to see the point of making forms so heavyweight. On the other hand the answer proposed here could bridge the gap elegantly, as it makes no requirement on the developer to do anything very much out of the ordinary, other than adding a header to the endpoint, which I think will be needed for any machine readable service.
IWC has considered namespaces but decided against it. Baggage from microformat's namespaces considered harmful and handful of other cherry picked anti-patterns - essentially conclusion/agenda first, compiling only supporting evidence next by the vocal minority, in combination with wiki policing etc.. Having said that, if the IWC community is comfortable with that type of governance, and can solve their own problems without namespaces, all the power to them. That unfortunately makes it difficult to interop with the other approaches.
Yes, that is why this proposal does not require IWC users to be preoccupied with namespaces. There is just the requirement for a Link header, and they can work with the Semantic Web team, where we can put what we need at the URL location to be able to automate servers and clients.
@csarven I think interop is still easy, as long as it's clear when one is at the boundary. For example, one might use the rel='webmention' as a boundary signpost. It's a little more code, but it's straightforward.
Just noting that @kevinmarks is echoing some of the points made here in issue 4. For those who don't like the Army example @melvincarvalho came up with an alternative one:
Donald wishes to say good night to his wife. He uses a webmention form to post to his wife's endpoint, pointing to the message "Good night, dear".
However Donald happens to be logged in to his work account which is connected to a drone system. By mistake the webmention is routed to a form which is used to target drone strikes. By adding "target" of his wife, the AI enabled drone system is able to deduce that Donald requires a strike to be carried out against the target.
If the software did what we all do before posting something, namely reading the page that requires us to click the button, then it would have found the required Link relation and it would have understood that that endpoint will not interpret its attribute values in the intended manner.
I'll additionally note that this is well understood in web security, which is why there is something such as CORS preflight requests.
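The check described above can be sketched in Python roughly as follows. This assumes the endpoint advertises its interpretation through a Link header with the proposed (and as yet unregistered) "urlencoded" relation. The function names, the drone URL, and the simplified header parsing are all illustrative assumptions; a real Link header parser would also have to handle commas and semicolons inside quoted values.

```python
import re

# Simplified pattern for one link-value: <target> followed by ;name=value params.
LINK_RE = re.compile(r'<([^>]*)>((?:\s*;\s*[^;,]+)*)')


def link_targets(header_value, rel):
    """Return the targets in a Link header value that carry the given relation."""
    targets = []
    for url, params in LINK_RE.findall(header_value):
        for param in params.split(";"):
            name, _, value = param.strip().partition("=")
            if name.lower() == "rel" and rel in value.strip('"').split():
                targets.append(url)
    return targets


def safe_to_post(response_headers, expected_mapping):
    """Post only if the endpoint declares the interpretation the client intends."""
    declared = link_targets(response_headers.get("Link", ""), "urlencoded")
    return expected_mapping in declared


# Headers as they might come back from a HEAD or GET on each endpoint.
wife_endpoint = {"Link": '<http://w3c.org/social/WebMention>; rel="urlencoded"'}
drone_endpoint = {"Link": '<http://example.org/drone/Strike>; rel="urlencoded"'}  # hypothetical URL

intended = "http://w3c.org/social/WebMention"
ok_for_wife = safe_to_post(wife_endpoint, intended)
ok_for_drone = safe_to_post(drone_endpoint, intended)
```

A client that performs this check before posting declines the second endpoint, because the form there declares a different interpretation of the same field names.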
@bblfish what about a simpler solution. Simply add the parameter
type=Webmention
That should provide enough context for a processor to not get confused.
It would also play nicely with a linked data paradigm where you'd want to add rdf:type = Webmention.
Does it scale to the web? Possibly not. But would it work in practice? Probably yes.
However, in a JSON formulation, which I think is the consensus in this group for passing messages around, I think it might work just fine. And also you then don't need the preflight.
Would that work, or did I miss something?
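For comparison, the type parameter suggestion amounts to a receiver-side check along these lines. This is a minimal Python sketch; the function name is hypothetical, and only the type=Webmention parameter and the source and target field names come from the discussion.

```python
from urllib.parse import parse_qs


def is_webmention(body):
    """Accept a form body only if it declares itself as a Webmention."""
    fields = parse_qs(body)
    return (
        fields.get("type") == ["Webmention"]
        and "source" in fields
        and "target" in fields
    )


tagged = (
    "type=Webmention"
    "&source=https%3A%2F%2Fexample.com%2Fsomesource"
    "&target=https%3A%2F%2Fexample.net%2Ftarget"
)
untagged = "source=https%3A%2F%2Fexample.com%2Fsomesource&target=https%3A%2F%2Fexample.net%2Ftarget"

accepted = is_webmention(tagged)
rejected = not is_webmention(untagged)
```

Note that this check only protects a receiver that performs it; a sender still cannot tell in advance whether a given endpoint does so.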
If it does not scale to the web, then it can't work in practice on the web. The reason CORS adds preflight requests on POSTs, PUTs and other non-idempotent methods is for reasons that are not far from the issues being discussed here.
Closing this issue due to lack of clear issue/suggestion and no further interest in the past 3 months. Please open a new issue with a specific topic or suggestion if you would like to discuss further.
In Lack of context WebMention the problem of the meaning of URLEncoded forms as parameter/values is considered.
This may not be that difficult to do. We could define a new Link: relation, say urlencoded, that would point to a transformer from urlencoding to RDF. This would allow a client, on making a request to a webmention endpoint,
to retrieve a result such as this (see Web Linking RFC), where of course the "urlencoded" relation needs to be described and registered correctly.
The document at <http://w3c.org/social/WebMention> would have both an HTML representation and a machine readable representation.
What one really wants is the ability to also retrieve a machine readable document from http://w3c.org/social/WebMention that would describe the url encoded form. It would have some yet to be determined mime type (that is not html), and would return something like this:
Where ?source and ?target are the attribute names of the form. This would allow the WebMention enabled clients to continue sending the attribute value pairs as they do now, and would allow a robot to interpret that to be equivalent to the rdf graph written out in Turtle as
(Clearly there is a piece of syntax still missing in the sketched language to turn the ?source and ?target strings into URLs.) This is not that complicated and would allow us to de-siloeify all forms on the web.
This would allow the IndieWeb folk to increase the security of their protocol while retaining their principle of remaining accessible, and it would allow this to be integrated generically into the SoLiD platform, so as to reduce configuration mistakes, and make it easier to automatically create such resources. This would require from the LDPnext side to work out how one can increase the mime type to such a urlencoded form.