microformats / microformats2-parsing

For collecting and handling issues with the microformats2 parsing specification: http://microformats.org/wiki/microformats2-parsing

Handling Non-Standard Formats #5

Closed AljoschaMeyer closed 2 years ago

AljoschaMeyer commented 8 years ago

The parsing specification is vague about handling non-standard formats, e.g. <div class="h-fwfkjwe">foo</div>. Should a parser include them in the JSON output, ignore them, warn, or error out?

The CSS selectors used to describe the parsing seem to indicate that non-standard formats are valid:

:not[.h-*] is not a valid CSS selector but is used here to mean:
does not have any class names that start with "h-"

and

The "*" for root (and property) class names consists only of lowercase a-z and '-' characters.

The test suite, however, does not contain any examples of this and only checks well-known formats.

In any case, stating this more explicitly in the specification wouldn't hurt.

voxpelli commented 8 years ago

My understanding is that parsers should not care about which formats are "standard" and which aren't. That is one of the main differences between Microformats 1 and 2: in Microformats 1, parsers needed to know about every format in order to parse it, whereas in Microformats 2, parsers can parse any format without specific knowledge of it.

gRegorLove commented 8 years ago

There probably should be tests in the suite for only parsing lowercase-alpha; we recently added a fix and tests for that in the PHP parser, indieweb/php-mf2.

Outside of that, I think it's generally a feature that it parses anything matching the pattern. It allows experimental mf2 class names like p-x-whatever without updating the parsing spec.
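
To illustrate the pattern-based matching discussed above, here is a minimal sketch in Python. It is not taken from any real parser: the regular expressions are an assumption based only on the wording quoted earlier in this thread (lowercase a-z and '-' after the prefix), and the names ROOT_CLASS, PROPERTY_CLASS, root_class_names and property_class_names are hypothetical.

```python
import re

# Assumption: the name after the prefix consists only of lowercase a-z and '-',
# following the spec wording quoted in this thread. Real parsers may apply
# stricter or newer rules.
ROOT_CLASS = re.compile(r"h-[a-z-]+")
PROPERTY_CLASS = re.compile(r"(?:p|u|dt|e)-[a-z-]+")


def root_class_names(class_attr):
    """Return the class names that look like mf2 root classes (h-*)."""
    return [name for name in class_attr.split() if ROOT_CLASS.fullmatch(name)]


def property_class_names(class_attr):
    """Return the class names that look like mf2 property classes."""
    return [name for name in class_attr.split() if PROPERTY_CLASS.fullmatch(name)]


# An unknown root such as h-fwfkjwe is accepted like any well-known one,
# while names containing uppercase letters are skipped.
print(root_class_names("h-fwfkjwe h-card H-Card h-Foo foo"))
print(property_class_names("p-x-whatever p-name P-name u-url x-p-foo"))
```

Nothing in the sketch consults a list of known formats, which is the point made above: a name is accepted or rejected purely by its shape.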

jgarber623 commented 4 years ago

I believe this issue has been addressed as of microformats/tests#110 (which was merged to master today).