html5lib / html5lib-tests

Testsuite data for html5lib, including the de-facto standard HTML parsing tests.
MIT License

Strange expected behavior in the `foreign-fragment.dat` test #66

Closed inikulin closed 9 years ago

inikulin commented 9 years ago

Hi, thank you for all your great work.

I'm having trouble figuring out how the following test from foreign-fragment.dat should work:

#data
</title>X
#errors
5: Stray end tag “title”.
#document-fragment
svg title
#document
| "X"

According to step 4 of the fragment parsing algorithm, we should initially set the tokenizer to the RCDATA state, since the fragment context is a <title> element. The tokenizer will therefore produce character tokens for </title>X, because the end tag token will not be an appropriate end tag token. So, for the test to pass, we would have to start in the data state, but that is only possible if step 4 of the fragment parsing algorithm says that the element must be in the HTML namespace. It has no such remark.

Is there a problem with the test or the spec, or is it just me missing something? :disappointed:

gsnedders commented 9 years ago

It's you missing something. :)

As you correctly note, the test is right if the fragment parsing algorithm only means elements in the HTML namespace. The cross-ref for title takes you to the HTML title element (which is obviously in the HTML namespace). See also in the terminology section:

Except where otherwise stated, all elements defined or mentioned in this specification are in the HTML namespace ("http://www.w3.org/1999/xhtml"), and all attributes defined or mentioned in this specification have no namespace.

Perhaps we should change the spec so it explicitly says "HTML element" everywhere, but that seems like it'll make it harder to read because it'll be repeated everywhere? Hard to tell what the right solution is.
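
To make the namespace point concrete, here is a minimal sketch of how step 4 could be read once the terminology note is applied. The function name `initial_tokenizer_state`, the string state labels, and the `scripting` parameter are hypothetical names chosen for illustration; this is not html5lib's API, just one way to write down the rule.

```python
HTML_NS = "http://www.w3.org/1999/xhtml"
SVG_NS = "http://www.w3.org/2000/svg"

# Context element names that switch the tokenizer state in step 4.
RCDATA_ELEMENTS = {"title", "textarea"}
RAWTEXT_ELEMENTS = {"style", "xmp", "iframe", "noembed", "noframes"}


def initial_tokenizer_state(namespace, local_name, scripting=False):
    """Pick the tokenizer's starting state for a fragment context element.

    Hypothetical helper: the names in step 4 are treated as HTML elements
    only, so any context element outside the HTML namespace starts in the
    data state.
    """
    if namespace != HTML_NS:
        # An SVG <title> is not the HTML title element.
        return "data"
    if local_name in RCDATA_ELEMENTS:
        return "RCDATA"
    if local_name in RAWTEXT_ELEMENTS:
        return "RAWTEXT"
    if local_name == "script":
        return "script data"
    if local_name == "noscript":
        return "RAWTEXT" if scripting else "data"
    if local_name == "plaintext":
        return "PLAINTEXT"
    return "data"


# The context from the test ("svg title") versus an HTML title context.
svg_title_state = initial_tokenizer_state(SVG_NS, "title")
html_title_state = initial_tokenizer_state(HTML_NS, "title")
```

With the `svg title` context the tokenizer starts in the data state, so `</title>` is tokenized as an end tag and reported as stray, leaving only "X" in the tree, as the test expects. With an HTML title context it would start in RCDATA instead.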

inikulin commented 9 years ago

I had a feeling that there was such a terminology remark. But I expected to find it in the cross-ref for Element, and that just led me to the interface list.

Thank you!

inikulin commented 9 years ago

However, you are right: it obviously leads to HTMLTitleElement. It's just fine as is.