Open jgarber623 opened 4 years ago
Here's the previous discussion and resolution.
My reading of the current spec is that this is correct for rel-urls:
{
"items": [],
"rels": {
"me": ["https://sixtwothree.org"],
"home": ["https://sixtwothree.org"]
},
"rel-urls": {
"https://sixtwothree.org": {
"text": "Jason Garber",
"rels": ["home", "me"]
}
}
}
Parsing the first link adds me to the rels; parsing the second adds the text property; parsing the third adds home to the rels.
Edit: Just noticed that this does lose the text value of the third link, since text is already set by the second one. Hm.
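To make the first-value-wins behavior concrete, here is a minimal Python sketch of the reading described above. It is not taken from any of the parsers discussed. The function name build_rel_urls, the pre-extracted link records, and the third link's "Home" text are all hypothetical, and skipping empty text is an assumption that follows the reading above rather than a settled rule.

```python
def build_rel_urls(links):
    """Aggregate rels per URL; keep only the first non-empty text (assumed reading)."""
    rel_urls = {}
    for link in links:
        entry = rel_urls.setdefault(link["href"], {"rels": []})
        # rel values accumulate across every link that shares the URL.
        for rel in link["rels"]:
            if rel not in entry["rels"]:
                entry["rels"].append(rel)
        # text is a plain string, so later values are silently dropped.
        text = link.get("text")
        if text and "text" not in entry:
            entry["text"] = text
    return rel_urls


# Hypothetical records standing in for the three links discussed above.
links = [
    {"href": "https://sixtwothree.org", "rels": ["me"], "text": ""},
    {"href": "https://sixtwothree.org", "rels": ["me"], "text": "Jason Garber"},
    {"href": "https://sixtwothree.org", "rels": ["home"], "text": "Home"},
]

result = build_rel_urls(links)
print(result["https://sixtwothree.org"]["text"])  # Jason Garber
```

The third record's text never reaches the output, which is the loss noted in the edit above.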
Tagging @kevinmarks and @sknebel on this one.
Building on something Kevin mentioned in chat: say you're viewing a blog post in a Web browser, and the page advertises alternate versions available at the same URL, with the response dictated by the incoming request's Accept header:
<link rel="alternate" href="https://sixtwothree.org/posts/877-days" type="application/json">
<link rel="alternate" href="https://sixtwothree.org/posts/877-days" type="text/markdown">
The above example is a modified version of some markup I have on my own website. Both versions are curl-able by issuing the following commands:
curl -H 'Accept: application/json' https://sixtwothree.org/posts/877-days
curl -H 'Accept: text/markdown' https://sixtwothree.org/posts/877-days
With the aforementioned parsers on microformats.io, you'd miss out on the text/markdown alternate version because the type key in the rel-urls structure is a simple string, not an aggregate array of matched values. The same would be true of hreflang, media, etc., but the use case for that data is a little less obvious to me.
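The difference can be sketched with Python's standard-library HTML parser and the two link elements above. This is only an illustration: the RelUrlCollector class, the collect helper, and the aggregate flag are hypothetical names, the string mode assumes the first value wins, and the sketch skips URL resolution and the text property.

```python
from html.parser import HTMLParser

MARKUP = """
<link rel="alternate" href="https://sixtwothree.org/posts/877-days" type="application/json">
<link rel="alternate" href="https://sixtwothree.org/posts/877-days" type="text/markdown">
"""

EXTENDED = ("hreflang", "media", "title", "type")


class RelUrlCollector(HTMLParser):
    """Collects a simplified rel-urls structure from link-like elements."""

    def __init__(self, aggregate):
        super().__init__()
        self.aggregate = aggregate  # True: arrays; False: first string wins
        self.rel_urls = {}

    def handle_starttag(self, tag, attrs):
        if tag not in ("a", "link", "area"):
            return
        attrs = dict(attrs)
        href, rel = attrs.get("href"), attrs.get("rel")
        if not href or not rel:
            return
        entry = self.rel_urls.setdefault(href, {"rels": []})
        for value in rel.split():
            if value not in entry["rels"]:
                entry["rels"].append(value)
        for name in EXTENDED:
            value = attrs.get(name)
            if value is None:
                continue
            if self.aggregate:
                # Proposed behavior: keep every distinct value.
                values = entry.setdefault(name, [])
                if value not in values:
                    values.append(value)
            else:
                # String behavior: later values are discarded.
                entry.setdefault(name, value)


def collect(markup, aggregate=False):
    parser = RelUrlCollector(aggregate)
    parser.feed(markup)
    parser.close()
    return parser.rel_urls


url = "https://sixtwothree.org/posts/877-days"
print(collect(MARKUP)[url]["type"])                  # application/json
print(collect(MARKUP, aggregate=True)[url]["type"])  # ['application/json', 'text/markdown']
```

With the string form a consumer never learns that a text/markdown representation exists; with the array form both alternates survive.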
@jgarber623 thanks for raising this. I too found this ambiguous while implementing a parser.
The output of https://aimee-gm.github.io/microformats-parser/ (a JavaScript parser) is:
{
"rels": {
"me": [
"https://sixtwothree.org"
],
"home": [
"https://sixtwothree.org"
]
},
"rel-urls": {
"https://sixtwothree.org": {
"rels": [
"me",
"home"
],
"text": ""
}
},
"items": []
}
I also agree with @gRegorLove that this should have a non-empty string text value.
Section 1.4 of the microformats2 parsing specification outlines how to parse link elements (<a>, <link>, etc.) for rel values and defines the JSON output structure.
The rels structure is reasonably straightforward and maps one-to-one with matched elements:
…results in…
The parsing rules break down slightly when compiling results for the rel-urls structure. For each unique URL, the resulting JSON hash should include a key rels whose value is an array of strings found across matched link elements. The spec also defines rules for parsing various attributes (hreflang, media, title, and type) and the node's text value. These extended attributes are specified as strings (not arrays), resulting in data loss and a seemingly inconsistent parsing pattern.
Parser Results
Parser developers have implemented this feature with differing results.
Given the markup:
…the parsers provide differing result JSON.
Go
PHP
Python
Ruby
Note: The Node parser on microformats.io appears to be offline.
So…
The test suite's rel tests appear to conform to the spec as it's written today. What I'd like help sorting out is what seems like an arbitrary (or, at least, undocumented) decision to only aggregate rel attribute values in the rel-urls result structure. The extended attributes are, per the spec, worth capturing, but not worth capturing as arrays. That seems strange.
Can someone shed some light on the subject, and/or can we update the spec to be clearer or to change the behavior?
Edit 1: #39 is tangentially related to this, as well.
Edit 2: #32 is also related to this.