validator / htmlparser

The Validator.nu HTML parser https://about.validator.nu/htmlparser/
Other
56 stars 26 forks source link

Don’t make COMMENT_LESSTHAN transition to itself #18

Closed sideshowbarker closed 4 years ago

sideshowbarker commented 4 years ago

This change drops the COMMENT_LESSTHAN state transition from the COMMENT_LESSTHAN itself — since we don’t need to have the state transitioning to itself.

The change also adds CPPONLY: MOZ_FALLTHROUGH in some appropriate places.

hsivonen commented 4 years ago

I thought the extra state transitions were there to account for the spec saying "reconsume in the comment state" but the code hoisting the reconsuming to the other states. (I can't recall why.) Does this still work without these hoisted transitions when the "reconsume" thing doesn't appear to happen?

hsivonen commented 4 years ago

Also, the fallthroughs here could use // CPPONLY: MOZ_FALLTHROUGH;

sideshowbarker commented 4 years ago

Does this still work without these hoisted transitions when the "reconsume" thing doesn't appear to happen?

Well one case it breaks is <!-- <-->. The change in this patch causes that to fail parsing as a comment. It results in “End of file inside comment”.

I thought the extra state transitions were there to account for the spec saying "reconsume in the comment state" but the code hoisting the reconsuming to the other states. (I can't recall why.)

Yeah. At this point, there’s a lot I don’t recall around this. But it seems clear that overall the existing code for it in the validator-nu branch is working as expected — so I will revert the removals I made, and get back to just your specific comment here:

https://github.com/validator/htmlparser/commit/29a2645b8d9b95f17151934f430bc5e9710912dc#r40291722

Were you asking specifically why it’s a state transition in the COMMENT_LESSTHAN state? Are you suggesting it should be removed just for that case?

hsivonen commented 4 years ago

Were you asking specifically why it’s a state transition in the COMMENT_LESSTHAN state? Are you suggesting it should be removed just for that case?

Yes. It seems that state transitions to itself, which generally doesn't need a call to transition.

sideshowbarker commented 4 years ago

Yes. It seems that state transitions to itself, which generally doesn't need a call to transition.

OK, I reverted the other removals and just kept the one removal from the COMMENT_LESSTHAN state.

Also, the fallthroughs here could use // CPPONLY: MOZ_FALLTHROUGH;

OK, added those.

So I git-amended the patch on the branch and force-pushed it here. Please re-review

hsivonen commented 4 years ago

Looks like I pushed this in the wrong place, but: https://github.com/validator/htmlparser/commit/ae8e1eea6cb8e8c679a52db80f5a63fd5f8a710f

sideshowbarker commented 4 years ago

Looks like I pushed this in the wrong place, but: ae8e1ee

Thanks, merged that into this PR branch — and will now merge over to the validator-nu branch