microformats / microformats2-parsing

For collecting and handling issues with the microformats2 parsing specification: http://microformats.org/wiki/microformats2-parsing

Define an algorithm to determine the primary mf2 object of a page #78

Open aaronpk opened 8 months ago

aaronpk commented 8 months ago

When looking at the parsed mf2 JSON of a page, it is not always trivial to determine which object is the "primary" object of the page. In some cases there is only one object, which is usually the primary object. In other cases there are multiple objects (for example, an h-card first and an h-entry second, where the h-entry is the primary object). When there are multiple objects, the page might be a feed, which has no primary object, or it might not be a feed.

I've already done quite a bit of work in XRay to determine what the primary object of a page is, and lots of test cases are included. However, there are still edge cases that the current algorithm does not handle well, so it would be useful to document these more formally, both to better cover the edge cases and so that others don't have to re-create the algorithm themselves.
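
To make the cases above concrete, here is a minimal sketch of the kind of heuristic being discussed. It is not XRay's algorithm. The function name `find_primary_object`, the `POST_TYPES` set, and the rule that a page with zero or several post-like objects has no primary object are all assumptions made for illustration. The only thing taken from the mf2 JSON format is the top-level `items` array and each item's `type` array.

```python
from typing import Optional

# Assumption: these "post-like" types outrank an h-card when both appear
# at the top level. The list is illustrative, not taken from any spec.
POST_TYPES = {"h-entry", "h-event", "h-review", "h-recipe", "h-product"}


def find_primary_object(parsed: dict) -> Optional[dict]:
    """Guess the primary top-level object of parsed mf2 JSON, or None."""
    items = parsed.get("items", [])

    # Assumption: an explicit top-level h-feed means the page is a feed,
    # so there is no primary object.
    if any("h-feed" in item.get("type", []) for item in items):
        return None

    # No objects at all: nothing to pick.
    if not items:
        return None

    # Only one object: treat it as the primary object.
    if len(items) == 1:
        return items[0]

    # Several objects: pick the single post-like object if there is
    # exactly one (e.g. h-card first, h-entry second).
    posts = [item for item in items if POST_TYPES & set(item.get("type", []))]
    if len(posts) == 1:
        return posts[0]

    # Zero or several post-like objects: assume a feed or an ambiguous
    # page and report no primary object.
    return None
```

With this sketch, a page whose items are an h-card followed by an h-entry yields the h-entry, while a page with two top-level h-entry objects yields `None`. The edge cases mentioned above are exactly the ones a rule set this simple would get wrong.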