Open ContingencyOfTautologicalContradictions opened 1 year ago
This sounds like an issue with the spec rather than with the glslang compiler, so I transferred it to the appropriate repository for that sort of issue.
I'm not sure what ambiguity you're aiming to clear up here, perhaps because I'm not sufficiently knowledgeable about UTF-8. Is there an alternative way of interpreting a UTF-8 sequence other than what you describe? I'm fine with spelling things out clearly, but this seems to be straying into territory that should be covered by the UTF-8 spec, rather than GLSL.
One specific concern that I have, for example, is that the proposed text talks about mapping the UTF-8 characters into the character set but doesn't say what the mapping is. I think that the UTF-8 codepoints actually already represent the characters, so don't need mapping, which is why the correct mapping is obvious, but if they're different enough to require mapping then we should say what the mapping is.
I'm not convinced that the handling of new lines in the proposed text is correct according to the current spec. GLSL currently says that any of "\r", "\n" or "\r\n" are a valid line break, which isn't the same as in your comment. I'm not sure what glslang implements for this.
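To make the line-break question concrete, here is a minimal sketch of a line counter under the reading described above, where each of "\r", "\n" and "\r\n" counts as exactly one line break. The function name `count_line_breaks` is hypothetical and this is not how glslang is implemented; it only illustrates the rule as stated in this comment.

```python
def count_line_breaks(source: bytes) -> int:
    """Count line breaks, treating "\r", "\n" and "\r\n" as one break each.

    Assumption: this follows the reading of the GLSL spec given in the
    comment above; it is an illustration, not glslang's actual behaviour.
    """
    count = 0
    i = 0
    while i < len(source):
        byte = source[i]
        if byte == 0x0D:  # carriage return
            count += 1
            # A line feed directly after a carriage return belongs to the same break.
            if i + 1 < len(source) and source[i + 1] == 0x0A:
                i += 1
        elif byte == 0x0A:  # bare line feed
            count += 1
        i += 1
    return count
```

Under this sketch a bare "\r" advances the line number just like "\n" does, which is the case where line numbers could diverge from an implementation that only recognises "\n" and "\r\n".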
It looks like glslang currently treats "\n" or "\r\n" as line terminators. The situation with bare "\r" is more complicated: I think it will not produce syntax errors, but it also will not give the right line numbers.
Note that the spec actually limits the valid characters in GLSL tokens to (a subset of) ASCII, and the core language does not have strings. The GLSL_EXT_debug_printf extension does add string literals, but the extension spec language still does not allow the use of codepoints above 126 in tokens, so the only place where non-ASCII characters can occur is in comments, where the current spec allows any byte values and doesn't require well-formed UTF-8. In practice, glslang doesn't enforce this and just accepts any sequence of bytes in a string literal (or in a header name in a #include, another place where arbitrary strings are allowed).
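The two checks discussed here, "no byte above 126" and "well-formed UTF-8", are different properties, and a small sketch can show the difference. Both helper names below are hypothetical; the well-formedness check simply relies on Python's strict UTF-8 decoder and is not taken from glslang or from the spec text.

```python
def first_byte_above_126(data: bytes) -> int:
    """Return the index of the first byte with value above 126, or -1 if none.

    Assumption: 126 is the limit quoted in the comment above for tokens.
    """
    for index, byte in enumerate(data):
        if byte > 126:
            return index
    return -1


def is_well_formed_utf8(data: bytes) -> bool:
    """Return True if data decodes as strict UTF-8, False otherwise."""
    try:
        data.decode("utf-8")
    except UnicodeDecodeError:
        return False
    return True
```

For example, the UTF-8 encoding of "// é" is well-formed UTF-8 but contains bytes above 126, so it would only be acceptable inside a comment under the rule described above, while a lone 0xFF byte fails both checks.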
In the GLSL 4.6 specification, add the following paragraph to section 3.1: