Unclear interaction between property-microformat collapsing and implied properties

JKingweb commented 1 year ago

The general parsing rules state:

if that child element itself has a microformat ("h-*" or backcompat roots) and is a property element, add it into the array of values for that property as a { } structure, add to that { } structure:

value:

if it's a p-* property element, use the first p-name of the h-* child

else if it's an e-* property element, re-use its { } structure with existing value: inside.

else if it's a u-* property element and the h-* child has a u-url, use the first such u-url

else use the parsed property value per p-*,u-*,dt-* parsing respectively

A strict reading excludes implied name and url (technically, they are not p- or u- properties) even though they would be suitable values. As a result, the parent's name property here has the value ABBA, whereas the child's own name is C:

<div class="h-parent">
  <div class="p-name h-child">
    <div>
      A<abbr title="C">BB</abbr>A
    </div>
  </div>
</div>

Current parser behaviour:

C: PHP, JavaScript, Go, Rust, Haskell, Ruby
ABBA: Python

gRegorLove commented 1 year ago

Since the parser has already recursed and parsed the child element at that point, I wonder if these lines should be changed to use the parsed properties from the child.

This line:

if it's a p-* property element, use the first p-name of the h-* child

Could become:

if it's a p-* property element, use the parsed name property of the h-* child

If the parsed name property is a { } structure, use its value property

Else use the first value in the name array

And so on for the other prefixes.
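To make the proposal concrete, here is a minimal Python sketch that works on the already-parsed child structure rather than on the DOM. The function name `collapsed_value` and the `PREFIX_TO_PROPERTY` table are hypothetical, not taken from any parser. The thread only spells out the p-* case; mapping u-* to `url` and returning `None` so the caller can fall back to ordinary parsing are assumptions of this sketch.

```python
from typing import Optional

# Hypothetical table: which parsed property of the child supplies `value`.
# Only the p-* row is stated in the proposal; the u-* row is an assumed
# reading of "and so on for the other prefixes".
PREFIX_TO_PROPERTY = {"p": "name", "u": "url"}


def collapsed_value(prefix: str, child: dict) -> Optional[str]:
    """Pick `value` for a nested microformat that is also a property.

    `child` is the parsed microformat, e.g.
    {"type": ["h-entry"], "properties": {"name": [...]}}.
    Because it is already parsed, implied and explicit properties
    look the same here.
    """
    prop = PREFIX_TO_PROPERTY.get(prefix)
    if prop is None:
        return None  # assumption: caller handles e-*, dt-* and others
    values = child.get("properties", {}).get(prop, [])
    if not values:
        return None  # assumption: caller falls back to ordinary parsing
    first = values[0]
    if isinstance(first, dict):
        # A { } structure (e-* value or nested microformat): use its value.
        return first.get("value")
    return first


entry = {
    "type": ["h-entry"],
    "properties": {
        "name": [{"html": "<b>Lorem ipsum</b>", "value": "Lorem ipsum"}]
    },
}
print(collapsed_value("p", entry))
```

With the e-name example below, this yields the string value of the name structure, and with an implied name it simply returns that name.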

I think this is what php-mf2 does in practice. I wonder what the other parsers do.

A php-mf2 example with odd usage of e-name to demonstrate the above:

<div class="h-feed">
  <article class="p-x-articles h-entry">
    <h1 class="e-name"><b>Lorem ipsum</b></h1>
  </article>
</div>

"type": [
    "h-feed"
],
"properties": {
    "x-articles": [
        {
            "type": [
                "h-entry"
            ],
            "properties": {
                "name": [
                    {
                        "html": "<b>Lorem ipsum</b>",
                        "value": "Lorem ipsum"
                    }
                ]
            },
            "value": "Lorem ipsum"
        }
    ]
}

JKingweb commented 1 year ago

> Since the parser has already recursed and parsed the child element at that point, I wonder if these lines should be changed to use the parsed properties from the child.

> This line:

> if it's a p-* property element, use the first p-name of the h-* child

> Could become:

> if it's a p-* property element, use the parsed name property of the h-* child

> If the parsed name property is a { } structure, use its value property

> Else use the first value in the name array

> And so on for the other prefixes.

> I think this is what php-mf2 does in practice.

This seems pretty sensible to me, though I think your text is incorrect. I suspect you meant something more like this:

if it's a p-* property element and the element's microformat has at least one name property, use the first name property of the h-* child as follows:

If the first name property is a { } structure, use its value property

Else use the first name property as parsed

The language is a bit tortured, unfortunately, but I think it expresses the spirit of your proposal accurately.

> I wonder what the other parsers do.

Modifying the test so that it is instead:

<div class="h-feed">
  <article class="p-x-articles h-entry">
    Fall through <h1 class="e-name"><b>Lorem ipsum</b></h1>
  </article>
</div>

Go falls through to the "use regular p- processing" step, which I believe to be the correct behaviour per the current text
JavaScript, Rust, Haskell, and Ruby use the entire name structure of the child
Python transcludes the name structure into the child microformat so that it has both value and html keys as siblings of properties
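
The three behaviours can be pictured as different ways of decorating the parsed child. The Python sketch below only illustrates the descriptions above; it is not code from any of the parsers, and all three function names are hypothetical. The `text_content` argument stands in for whatever ordinary p- parsing of the element would produce.

```python
import copy


def fall_through(child: dict, text_content: str) -> dict:
    """As described for Go: `value` comes from ordinary p- parsing."""
    result = copy.deepcopy(child)
    result["value"] = text_content
    return result


def whole_structure(child: dict) -> dict:
    """As described for JavaScript, Rust, Haskell and Ruby:
    `value` is the child's entire first name structure."""
    result = copy.deepcopy(child)
    result["value"] = copy.deepcopy(child["properties"]["name"][0])
    return result


def transcluded(child: dict) -> dict:
    """As described for Python: the keys of the name structure
    become siblings of `properties`."""
    result = copy.deepcopy(child)
    result.update(copy.deepcopy(child["properties"]["name"][0]))
    return result


child = {
    "type": ["h-entry"],
    "properties": {
        "name": [{"html": "<b>Lorem ipsum</b>", "value": "Lorem ipsum"}]
    },
}

for variant in (
    fall_through(child, "text from p- parsing"),
    whole_structure(child),
    transcluded(child),
):
    print(sorted(variant.keys()), repr(variant["value"]))
```

Only the second variant ends up with a non-string `value`, and only the third gains an `html` key next to `properties`.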

microformats / microformats2-parsing

Unclear interaction between property-microformat collapsing and implied properties #66