aaronpk closed this issue 7 years ago
I believe this is related to the following section of the microformats wiki:
http://microformats.org/wiki/microformats-2 Quote:
FOR PARSERS ONLY:
Without a property class name like 'p-org' holding all the nested objects together, we need to introduce another array for nested children (similar to the existing DOM element notion of children) of a microformat that are not attached to a specific property:
Parsed JSON:
{
  "items": [{
    "type": ["h-card"],
    "properties": {
      "name": ["Mitchell Baker"],
      "url": ["http://blog.lizardwrangler.com/"]
    },
    "children": [{
      "type": ["h-card","h-org"],
      "properties": {
        "name": ["Mozilla Foundation"],
        "url": ["http://mozilla.org/"]
      }
    }]
  }]
}
Since there's no property class name on the element with classes 'h-card' and 'h-org', the microformat representing that element is collected into the children array.
Such a nested microformat implies some relationship (containment, being related), but is not as useful as if the nested microformat was a specific property of its parent.
For this reason it's recommended that authors should not publish nested microformats without a property class name, and instead, when nesting microformats, authors should always specify a property class name (like 'p-org') on the same element as the root class name(s) of the nested microformat(s) (like 'h-card' and/or 'h-org').
This does not appear to be implemented yet.
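To make the expected shape concrete, here is a minimal sketch of walking parsed output and visiting every nested microformat, whether it hangs off a property or sits in the children array. It works on a plain Hash shaped like the wiki example above, not on this gem's own objects; `each_microformat` and `all_microformats` are hypothetical helper names, not part of any library.

```ruby
require "json"

# Parsed output copied from the wiki example quoted above.
PARSED = JSON.parse(<<~EOS)
  {
    "items": [{
      "type": ["h-card"],
      "properties": {
        "name": ["Mitchell Baker"],
        "url": ["http://blog.lizardwrangler.com/"]
      },
      "children": [{
        "type": ["h-card", "h-org"],
        "properties": {
          "name": ["Mozilla Foundation"],
          "url": ["http://mozilla.org/"]
        }
      }]
    }]
  }
EOS

# Hypothetical helper: depth-first walk that yields each microformat
# object together with its nesting depth.
def each_microformat(item, depth = 0, &block)
  yield item, depth

  # Microformats nested under a property show up as Hash values
  # that carry their own "type" key.
  item.fetch("properties", {}).each_value do |values|
    values.each do |value|
      each_microformat(value, depth + 1, &block) if value.is_a?(Hash) && value.key?("type")
    end
  end

  # Microformats nested without a property class name land in "children".
  item.fetch("children", []).each do |child|
    each_microformat(child, depth + 1, &block)
  end
end

# Hypothetical helper: collect [depth, type] pairs for the whole document.
def all_microformats(parsed)
  found = []
  parsed.fetch("items", []).each do |item|
    each_microformat(item) { |mf, depth| found << [depth, mf["type"]] }
  end
  found
end
```

For the wiki example, `all_microformats(PARSED)` reports the outer h-card at depth 0 and the h-card/h-org child at depth 1, which is the nesting the children array is meant to preserve.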
Any updates? I just had to do an ugly workaround for this, dropping down to use the to_hash version of the parsed data: https://github.com/aaronpk/webmention.io/commit/d2cc83613c2571cada747cd26010839a0841b7e5#diff-411ca3c70351e774091d525fab8264b9R330
Not as of yet, unfortunately.
I tried to parse http://tantek.com/ and it crashed the parser on something like this:
<div class="h-entry">
<p class="u-comment h-cite">
test .
</p>
</div>
with:
URI::InvalidURIError: bad URI(is not URI?): test .
It would be nice if the parser at least didn't crash.
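One way to avoid the crash, sketched with Ruby's standard `uri` library only: rescue `URI::InvalidURIError` when resolving a `u-*` value and fall back to the raw text. `resolve_url` is a hypothetical helper, not the gem's actual code, and returning the unresolved text is an assumption about what a caller would want.

```ruby
require "uri"

# Hypothetical helper (not part of the microformats2 gem): resolve a
# u-* property value against the page's base URL. If the value is not
# a valid URI, return the original text instead of raising.
def resolve_url(value, base)
  URI.join(base, value.strip).to_s
rescue URI::InvalidURIError
  # Assumption: keeping the raw text is preferable to aborting the parse.
  value
end
```

With this guard, `resolve_url("test .", "http://tantek.com/")` hands back the original string rather than raising, while a valid relative reference such as `/notes/1` is still resolved against the base.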
@jeena It seems like the parser is pretty much abandoned by the G5 folks. We got paid to build it (when I was still there) as a part of a larger product. But if it does as much as G5 needs and they're otherwise too busy, it's not likely to get the attention it deserves.
If you're able and willing to submit a pull request with a patch, I could apply it for you.
Unfortunately, (like too many open source projects) this doesn't really have a maintainer anymore. 😕
Oh ok, that's sad, but understandable. Perhaps that could be noted somewhere in the README so people know, and someone might be able to take it over. I'm not sure I will be able to fix something like this, but if I can, I will make a pull request.
@jeena @aaronpk I believe this is fixed in 3.0. I just did a test and it looks good to me. Please upgrade and run your comparison too. Re-open this issue if necessary.
It seems the parser is not handling nested objects properly.
For example, this URL: http://aaronparecki.com/notes/2014/07/04/2/indiewebcamp-latergram
It appears the comment authors and comment URLs all show up under the main h-entry when in reality they should be under children of the main h-entry as their own h-cite objects.
Compare the result of the PHP parser