Open jgarber623 opened 4 years ago
I think the current behavior listed above results in a more consistent result for consumers, with html and value appearing in a consistent location and value always being a string.
So, it turns out that because this isn't included in the test suite, I managed to skip that line in the specification.
I don't want to get too involved in what the values should be (I would like to know though!), but a couple of comments:
The root element will now have a html property - this is described nowhere in the specification so cannot be expected to be there.
Take the markup:
<div class="h-entry">
<img class="u-photo h-card" alt="My name" src="/photo.jpg">
</div>
Looking at the specification:
else if it's a u- property element and the h- child has a u-url, use the first such u-url
The photo above doesn't have a url property, so it falls back to the photo property from:
else use the parsed property value per p-,u-,dt-* parsing respectively
As it has no nested u-photo, it becomes an implied photo, whose value comes from:
if img.h-x[src], then use the result of "parse an img element for src and alt" (see Sec.1.5) for photo
Which means it should be: { value: "...", alt: "..." }. This then becomes the complete value of the h-card based on the above specification.
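The fallback chain walked through above can be sketched in Python. This is only an illustration of that reading of the spec, not any parser's actual code: the function name `nested_u_value` and the plain-dict item shape are assumptions made for the example.

```python
def nested_u_value(child_item, parsed_property_value):
    """Pick `value` for a nested microformat found on a u-* property element.

    child_item: the already-parsed h-* child, e.g. {"type": [...], "properties": {...}}
    parsed_property_value: what plain u-* parsing of the same element produced
    """
    urls = child_item.get("properties", {}).get("url", [])
    if urls:
        # "use the first such u-url"
        return urls[0]
    # "else use the parsed property value per p-,u-,dt-* parsing respectively"
    return parsed_property_value


# The img.u-photo.h-card example: there is no url property,
# so the u-* parse result (an object, because alt is present) is used.
photo = {"alt": "My name", "value": "http://example.com/photo.jpg"}
h_card = {"type": ["h-card"], "properties": {"name": ["My name"], "photo": [photo]}}
h_card["value"] = nested_u_value(h_card, photo)
```

Under this reading, value ends up as the same { alt, value } object as the implied photo, which is why it is no longer a plain string.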
Expected output
{
  "type": ["h-entry"],
  "properties": {
    "photo": [
      {
        "type": ["h-card"],
        "properties": {
          "name": ["My name"],
          "photo": [
            { "alt": "My name", "value": "http://example.com/photo.jpg" }
          ]
        },
        "value": {
          "alt": "My name",
          "value": "http://example.com/photo.jpg"
        }
      }
    ]
  }
}
The PHP parser at microformats.io doesn't parse the alt at any level here, I believe incorrectly, so I've omitted its output.
Again, the contents of value would no longer be a string. How should these be handled?
The way I've decided to interpret this is to take the value out of the nested property.
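One way to read "take the value out of the nested property" is to unwrap the { alt, value } object so that value stays a string. A minimal Python sketch of that reading follows; `flatten_value` is a hypothetical helper name, not part of any parser.

```python
def flatten_value(value):
    """Return a plain string for `value`, unwrapping an {alt, value} object if needed."""
    if isinstance(value, dict):
        # Object form (img with alt): keep only the URL string.
        return value["value"]
    # Already a string: nothing to unwrap.
    return value


flat = flatten_value({"alt": "My name", "value": "http://example.com/photo.jpg"})
```

The trade-off is that the alt text is then only available on the nested item's own photo property, not on its value.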
The root element will now have a html property - this is described nowhere in the specification so cannot be expected to be there.
I'm not sure I understand this part. What do you mean by root element? I would expect the parsed content property to have an html property in both cases.
In the common e-content example:
<div class="h-entry">
<div class="e-content"><p>This is the content</p></div>
</div>
The parsed result is:
"items": [
  {
    "type": [
      "h-entry"
    ],
    "properties": {
      "content": [
        {
          "html": "<p>This is the content</p>",
          "value": "This is the content"
        }
      ]
    }
  }
]
Adding a nested h-card:
<div class="h-entry">
<div class="e-content h-card"><p>This is the content</p></div>
</div>
I would expect the parse to be:
"items": [
  {
    "type": [
      "h-entry"
    ],
    "properties": {
      "content": [
        {
          "type": [
            "h-card"
          ],
          "properties": {
            "name": [
              "This is the content"
            ]
          },
          "html": "<p>This is the content</p>",
          "value": "This is the content"
        }
      ]
    }
  }
]
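The expected shape above amounts to keeping the e-* parse result's html and value keys and placing them alongside the nested item's type and properties. A rough Python sketch of that merge, assuming plain dicts and a hypothetical helper name `embed_nested_item`:

```python
def embed_nested_item(e_parsed, nested_item):
    """Combine an e-* parse result ({"html", "value"}) with a nested h-* item."""
    merged = dict(nested_item)           # shallow copy of type/properties; input is not mutated
    merged["html"] = e_parsed["html"]    # html stays in the same place as for a plain e-* property
    merged["value"] = e_parsed["value"]  # value remains a string
    return merged


content = embed_nested_item(
    {"html": "<p>This is the content</p>", "value": "This is the content"},
    {"type": ["h-card"], "properties": {"name": ["This is the content"]}},
)
```

With this approach a consumer can read html and value from the content property without checking whether a nested microformat is present.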
Images are a special case: if there's an alt, the parsed result will be an object, otherwise a string. (alt parsing is in the php-mf2 master branch and hopefully will be in a new release soon.)
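The image special case can be illustrated with a short Python sketch. It is not php-mf2's code: `parse_img` and its parameters are hypothetical, and it assumes a missing alt attribute is passed as None.

```python
from urllib.parse import urljoin


def parse_img(src, alt, base_url):
    """Return an {value, alt} object when alt is present, otherwise just the URL string."""
    url = urljoin(base_url, src)  # resolve a relative src against the page URL
    if alt is not None:
        return {"value": url, "alt": alt}
    return url


with_alt = parse_img("/photo.jpg", "My name", "http://example.com/")
without_alt = parse_img("/photo.jpg", None, "http://example.com/")
```

Consumers therefore have to handle both a string and an object for image properties.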
Following up on a conversation I started in chat today, I'd like to clarify a section in the parsing spec related to generating output for parsed elements containing both property class and root class names.
The wording from section 1.2 of the parsing spec (emphasis added):
The test suite includes test cases for p-* and u-* (see microformats-v2/h-entry/impliedvalue-nested.html, for instance) properties, but I couldn't find a test case against an e-* property whose element also had a root class name.
I interpret "re-use its { } structure with existing value:" to mean that the nested item's value should be set to the hash structure. That would result in something like:
Current Behavior
Using a contrived markup example like:
…parsers currently output results like:
Expected Behavior
Using the same markup example, and by my interpretation of the specification, I'd expect output like:
Proposals?
Which of the above is a correct interpretation of the spec? Existing evidence from parsers and the non-authoritative microformats2-json wiki page points to the current behavior being the correct interpretation, despite the unclear wording in the spec.
Is that the consensus of the community? If so, we should find a way to re-word the spec. If not, we should find a way to re-word the spec.
Thanks for reading! Looking forward to feedback.