A joint Study Group between the C (WG14) and C++ (WG21) Committees to ensure the longterm synchronization and cooperation of the C and C++ programming languages where any mutual interests lie.
1
stars
0
forks
source link
CWG #2694: string-literals of the _Pragma operator #29
A unary operator expression of the form:
_Pragma ( string-literal )
is processed as follows: The string-literal is destringized by deleting the L prefix, if present, deleting the leading and trailing double-quotes, replacing each escape sequence \" by a double-quote, and replacing each escape sequence \ by a single backslash. The resulting sequence of characters is processed through translation phase 3 to produce preprocessing tokens that are executed as if they were the pp-tokens in a pragma directive. The original four preprocessing tokens in the unary operator expression are removed.
In contrast, C23 section 6.10.6 specifies:
A unary operator expression of the form:
_Pragma ( string-literal )
is processed as follows: The string literal is destringized by deleting any encoding prefix, ...
While in C, any encoding prefix is deleted, C++ only deletes the L prefix and does not consider UTF-8, UTF-16, and UTF-32 string literals. This was probably an oversight when C++ obtained UTF-x string literals.
However, a string-literal entails lexing of escape sequences and universal-character-names, which seems not useful given that the string literal is de-stringized immediately afterwards. It might be more appropriate to employ a simpler lexical structure such as a q-char-sequence as used in a header-name instead.
Notify WG14 of CWG #2694
Subclause 15.12 [cpp.pragma.op] paragraph 1 specifies:
A unary operator expression of the form: _Pragma ( string-literal ) is processed as follows: The string-literal is destringized by deleting the L prefix, if present, deleting the leading and trailing double-quotes, replacing each escape sequence \" by a double-quote, and replacing each escape sequence \ by a single backslash. The resulting sequence of characters is processed through translation phase 3 to produce preprocessing tokens that are executed as if they were the pp-tokens in a pragma directive. The original four preprocessing tokens in the unary operator expression are removed. In contrast, C23 section 6.10.6 specifies:
A unary operator expression of the form: _Pragma ( string-literal ) is processed as follows: The string literal is destringized by deleting any encoding prefix, ... While in C, any encoding prefix is deleted, C++ only deletes the L prefix and does not consider UTF-8, UTF-16, and UTF-32 string literals. This was probably an oversight when C++ obtained UTF-x string literals.
However, a string-literal entails lexing of escape sequences and universal-character-names, which seems not useful given that the string literal is de-stringized immediately afterwards. It might be more appropriate to employ a simpler lexical structure such as a q-char-sequence as used in a header-name instead.