I'm finishing up testing of a tokenizer implementation, which I've tried to make pass the test suite in its entirety, including line and column positions for parse errors.
While counting supplementary-plane characters as two columns has some logic to it, I'm struggling to understand the positions in the tests added by @hsivonen:
https://github.com/html5lib/html5lib-tests/pull/121/files
Other EOF errors (and indeed, other eof-in-comment errors) use the position of the EOF itself, while these use a position one character before it. The position is also on line 1 rather than line 2, despite there being a line break just before the EOF.
Is this simply an oversight?
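
For reference, here is a minimal sketch of the counting I would have expected for an EOF position. `eof_position` is a hypothetical helper, not code from the test suite or from any particular tokenizer. It assumes 1-based lines, columns counted in UTF-16 code units consumed on the current line (so supplementary-plane characters count for two), and a line feed starting a new line at column 0. Those conventions are my assumptions and may not match what the tests intend.

```python
def eof_position(text):
    """Return (line, column) of the end of input.

    Assumptions (not confirmed by the test suite): lines are 1-based,
    the column is the number of UTF-16 code units consumed on the
    current line, and a line feed resets the column to 0.
    """
    line, col = 1, 0
    for ch in text:
        if ch == "\n":
            # A line break moves the position to the start of the next line.
            line += 1
            col = 0
        else:
            # Characters above U+FFFF need a surrogate pair in UTF-16,
            # so they advance the column by two.
            col += 2 if ord(ch) > 0xFFFF else 1
    return line, col
```

Under these assumptions, any input ending in a line break puts the EOF on line 2 or later, never on line 1, which is why the expected positions in those tests surprised me.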