Open rcseacord opened 1 year ago
Thanks for reporting this. We didn't implement this rule because:
- The -fno-digraphs flag is specified when using clang (further explanation below).
- gcc was not relevant.
We are, however, looking to expand our compiler support to gcc-like compilers, so we may have to consider what we could do for this rule. Native support for detecting digraphs would be a low-priority feature request for the CodeQL C++ team, so we may have to look at workarounds, such as a lexical analyzer.
> I haven't tested this, but presumably "disables" means that it stops converting them into the corresponding characters. However, to be compliant with the rule we need to diagnose these sequences of characters, even if they are not translated.
Digraphs behave differently from trigraphs, in that they are alternative tokens rather than straight character replacements. Trigraphs get replaced in the very first phase of translation, by matching character sequences. In comparison, digraphs are only considered when operators and punctuators are tokenized. Notably, this is after string literals and comments have been processed (see [lex.phases]), so digraphs can never occur in comments or string literals, and as tokens they can only appear in valid places in the grammar.
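To make the "alternative token" point concrete, here is a small sketch. It is not taken from the coding standards repository, the function names are made up for illustration, and it assumes a compiler with digraphs enabled (the usual default). It shows digraphs used as tokens, the same characters left untouched inside a string literal, and stringizing preserving the alternative spelling:

```cpp
#include <cassert>
#include <cstring>

#define STRINGIZE(x) #x

// Digraphs used as tokens: equivalent to `int values[3] = {1, 2, 3};`.
inline int second_element() {
    int values<:3:> = <%1, 2, 3%>;
    return values<:1:>;
}

// Inside a string literal the same characters are ordinary text,
// because the literal is already a single token.
inline const char* literal_text() { return "<: :> <% %>"; }

// An alternative token behaves like `[` but keeps its own spelling,
// which becomes visible when it is stringized.
inline const char* stringized_digraph() { return STRINGIZE(<:); }

// Hypothetical self-check tying the three observations together.
inline void check_digraph_examples() {
    assert(second_element() == 2);
    assert(std::strlen(literal_text()) == 11);
    assert(std::strcmp(stringized_digraph(), "<:") == 0);
}
```

A trigraph-style textual replacement would have rewritten the string literal as well; here only the token positions are affected.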
My interpretation of this rule is therefore that the sequences specified are only "digraphs" if they appear in a place where they would be tokenized as such. With -fno-digraphs specified, the digraphs are no longer tokenized, which, I believe, will always lead to a program with digraphs failing to compile (counterexamples welcome).
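As a rough idea of what the lexical analyzer workaround mentioned above could look like, here is a minimal sketch under that interpretation: it reports a digraph spelling only where it would be tokenized as one. Everything in it (DigraphHit, find_digraphs, the helpers) is hypothetical and not part of CodeQL or this repository. It deliberately ignores raw string literals, digit separators, header names and line splices inside comments, so treat it as a starting point only:

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical result type: byte offset and spelling of one digraph token.
struct DigraphHit {
    std::size_t offset;
    std::string spelling;
};

// Skips a '...' or "..." literal starting at src[i], honouring backslash
// escapes. Stops after the closing quote, or at an unescaped newline.
inline std::size_t skip_quoted(const std::string& src, std::size_t i) {
    const char quote = src[i++];
    while (i < src.size() && src[i] != '\n') {
        if (src[i] == '\\' && i + 1 < src.size()) {
            i += 2;
            continue;
        }
        if (src[i++] == quote) break;
    }
    return i;
}

inline bool starts_with_at(const std::string& src, std::size_t i,
                           const std::string& s) {
    return src.compare(i, s.size(), s) == 0;
}

inline std::vector<DigraphHit> find_digraphs(const std::string& src) {
    // Longest spelling first so that %:%: is not reported as two %: tokens.
    static const std::string digraphs[] = {"%:%:", "<:", ":>", "<%", "%>", "%:"};
    // Ordinary tokens consumed whole so their tail cannot start a false match.
    static const std::string others[] = {"::", "<<"};
    std::vector<DigraphHit> hits;
    std::size_t i = 0;
    while (i < src.size()) {
        if (starts_with_at(src, i, "//")) {  // line comment
            i = src.find('\n', i);
            if (i == std::string::npos) break;
            continue;
        }
        if (starts_with_at(src, i, "/*")) {  // block comment
            const std::size_t end = src.find("*/", i + 2);
            if (end == std::string::npos) break;
            i = end + 2;
            continue;
        }
        if (src[i] == '"' || src[i] == '\'') {  // string or character literal
            i = skip_quoted(src, i);
            continue;
        }
        // C++11 special case: in `<::` not followed by `:` or `>`, the `<`
        // is a token by itself, as in `std::vector<::std::string>`.
        if (starts_with_at(src, i, "<::") &&
            (i + 3 >= src.size() || (src[i + 3] != ':' && src[i + 3] != '>'))) {
            ++i;
            continue;
        }
        bool matched = false;
        for (const std::string& d : digraphs) {
            if (starts_with_at(src, i, d)) {
                hits.push_back(DigraphHit{i, d});
                i += d.size();
                matched = true;
                break;
            }
        }
        if (matched) continue;
        for (const std::string& t : others) {
            if (starts_with_at(src, i, t)) {
                i += t.size();
                matched = true;
                break;
            }
        }
        if (!matched) ++i;
    }
    return hits;
}

// Hypothetical usage: digraphs in code are reported, those in comments and
// string literals are not, and `<::` in a template argument is not a hit.
inline void find_digraphs_self_check() {
    const std::vector<DigraphHit> hits =
        find_digraphs("int a<:2:>; // <% ignored\n");
    assert(hits.size() == 2);
    assert(hits[0].offset == 5 && hits[0].spelling == "<:");
    assert(hits[1].offset == 8 && hits[1].spelling == ":>");
    assert(find_digraphs("const char* s = \"<: %>\";").empty());
    assert(find_digraphs("std::vector<::std::string> v;").empty());
}
```

Because it works on the raw source text, a scanner like this would not depend on which compiler is in use, but it would need to grow towards a real preprocessing-token lexer to be trustworthy.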
Has any additional consideration been given to implementing this rule?
Affected rules
A2-5-2
Description
The checker for "Rule A2-5-2 (required, implementation, automated) Digraphs shall not be used." was not implemented. Presumably, the expectation was that compiler flags would be sufficient. However, this is not the case.
Clang has the following flag:
-fno-digraphs
    Disables alternative token representations <:, :>, <%, %>, %:, %:%: (default)
I haven't tested this, but presumably "disables" means that it stops converting them into the corresponding characters. However, to be compliant with the rule we need to diagnose these sequences of characters, even if they are not translated.
GCC is much worse: it has no checker at all, so there is no way to enforce this rule if you are using that compiler.