microformats / php-mf2

php-mf2 is a pure, generic microformats-2 parser for PHP. It makes HTML as easy to consume as JSON.
Creative Commons Zero v1.0 Universal

Fix implied u-url when multiple links #110

Closed: gRegorLove closed 7 years ago

gRegorLove commented 7 years ago

Noticed on https://indieweb.org/events that a u-url was being implied when it shouldn't have been, because the h-event contains two links.

<span class="h-event vevent" style="background:#FF9"><b><time class="dt-start dtstart">2017-05-04</time>…<time class="dt-end dtend" datetime="2017-05-05">05</time>: <span class="p-name summary"><a href="/2017/Bellingham" title="2017/Bellingham">IndieWebCamp Bellingham 2017</a></span></b> right before <a href="/LinuxFest_Northwest" title="LinuxFest Northwest">LinuxFest Northwest</a> in Bellingham, WA. First IWC Bellingham! </span>

http://microformats.org/wiki/microformats2-parsing##if+no+explicit+%22url%22+property

https://chat.indieweb.org/microformats/2017-03-11#t1489215954258000

gRegorLove commented 7 years ago

Upon closer inspection, this appears to be parsing correctly per the algorithm:

else if .h-x>a[href]:only-of-type:not[.h-*], then use that [href] for url

There is only one a[href] as a direct child of .h-x. The other a[href] is at .h-x > b > a[href].

If the </b> is moved to enclose both links, then no u-url is implied (confirmed in php-mf2, mf2py, and microformat-shiv parsers).

Similarly, all of those parsers get /LinuxFest_Northwest as the implied u-url for the original markup. I don't think there's a parsing issue here; it's just a matter of fixing the source HTML or adding explicit u-url properties.
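
To make the direct-child distinction concrete, here is a rough Python sketch of just the one rule quoted above. It is not how php-mf2, mf2py, or microformat-shiv implement it; the names `DirectChildAnchors` and `implied_url` are made up for illustration, the markup is a trimmed copy of the snippet from the events page, and the sketch assumes there is no explicit u-url and ignores the other implied-url rules in the spec.

```python
from html.parser import HTMLParser

# Void elements never get a closing tag, so they must not affect nesting depth.
VOID = {"area", "base", "br", "col", "embed", "hr", "img", "input",
        "link", "meta", "source", "track", "wbr"}


class DirectChildAnchors(HTMLParser):
    """Collect <a> elements that are direct children of the first h-* root."""

    def __init__(self):
        super().__init__()
        self.depth = 0      # 0 = outside the root, 1 = directly inside it
        self.done = False   # set once the root element has been closed
        self.anchors = []   # (href, classes) for each direct-child <a>

    def handle_starttag(self, tag, attrs):
        if self.done or tag in VOID:
            return
        attrs = dict(attrs)
        classes = (attrs.get("class") or "").split()
        if self.depth == 0:
            # Simplification: any class starting with "h-" counts as a root.
            if any(c.startswith("h-") for c in classes):
                self.depth = 1
            return
        if self.depth == 1 and tag == "a":
            self.anchors.append((attrs.get("href"), classes))
        self.depth += 1

    def handle_endtag(self, tag):
        if self.done or self.depth == 0 or tag in VOID:
            return
        self.depth -= 1
        if self.depth == 0:
            self.done = True


def implied_url(html):
    """Hypothetical helper: apply only `.h-x>a[href]:only-of-type:not[.h-*]`."""
    parser = DirectChildAnchors()
    parser.feed(html)
    parser.close()
    if len(parser.anchors) != 1:  # zero or several <a> children: rule does not match
        return None
    href, classes = parser.anchors[0]
    if href is None or any(c.startswith("h-") for c in classes):
        return None
    return href


# Trimmed version of the markup from the events page.
ORIGINAL = (
    '<span class="h-event vevent"><b>'
    '<span class="p-name summary"><a href="/2017/Bellingham">IndieWebCamp Bellingham 2017</a></span>'
    '</b> right before <a href="/LinuxFest_Northwest">LinuxFest Northwest</a> in Bellingham, WA.</span>'
)
# Same markup with the </b> moved so that it encloses both links.
MOVED_B = (
    '<span class="h-event vevent"><b>'
    '<span class="p-name summary"><a href="/2017/Bellingham">IndieWebCamp Bellingham 2017</a></span>'
    ' right before <a href="/LinuxFest_Northwest">LinuxFest Northwest</a></b> in Bellingham, WA.</span>'
)

print(implied_url(ORIGINAL))  # /LinuxFest_Northwest
print(implied_url(MOVED_B))   # None
```

In the original markup the only `a` that is a direct child of the h-event is the LinuxFest link, so the rule matches it. Once both links sit inside the `b`, the h-event has no direct-child `a` at all and the rule no longer applies.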

tantek commented 7 years ago

Indeed. In cases like this, where the h-event is a summary in a list of events, perhaps omit the "right before" link to something else, or move it after the closing </span> of the h-event.

gRegorLove commented 7 years ago

Ok, closing since not a parser bug.