Open llvmbot opened 5 years ago
@Tim: Per the standard,
Quote 1: "A source file that is not empty shall end in a new-line character, which shall not be immediately preceded by a backslash character before any such splicing takes place."
By construction of the standard, i.e. by the statement above, a source file must have that property. If it does not, it is not a source file, as a source file shall have that property.
Second point, that "shall" is in fact a requirement on the "implementation", which chapter 5 of the standard defines as:
"An implementation translates C source files..."
Furthermore, the clause I cited above (Quote 1), which we are not complying with, falls under "The precedence among the syntax rules of translation is specified by the following phases:". Hence it is definitely a "shall" that the compiler/implementation must enforce if we are to follow the standard.
As for making it a benign warning, that makes sense as a matter of pragmatism. However, if we are to ignore it in the implementation, I wonder why it is a requirement in the standard. If we are not enforcing it, why not remove it or relax the condition?
@hstong: I just checked:
00000000 69 6e 74 20 6d 61 69 6e 28 29 7b 20 72 65 74 75 |int main(){ retu|
00000010 72 6e 20 30 3b 7d 0a 5c 0a |rn 0;}.\.|
00000019
The above produces the warning in gcc, and compiles fine in clang, as I described earlier.
00000000 69 6e 74 20 6d 61 69 6e 28 29 7b 20 72 65 74 75 |int main(){ retu|
00000010 72 6e 20 30 3b 7d 0a 5c 0a 0a |rn 0;}.\..|
0000001a
Compiles fine on both.
@Richard: thank you for checking.
I can reproduce this. Here's what I see: a source file ending in:
The second case is an accepts-invalid bug.
I would suggest actually inspecting the file to determine if there is a newline character at the end of the file that is not preceded by backslash.
The following version of elaborate_case.c reproduces the behaviour you observed.
od -A x -t x1 <elaborate_case.c
000000 69 6e 74 20 6d 61 69 6e 28 29 7b 20 72 65 74 75
000010 72 6e 20 30 3b 20 7d 0a 5c 0a 0a
00001b
Return: 0x00:0
Notice that there is a newline (0x0A) character not preceded by a backslash (0x5C) at the end of the file.
It would still be a source file officially (just by virtue of ending up as input to the compiler, I think) but the C++ standard would call it "ill-formed". C uses less sophisticated terminology.
Either way, we probably should diagnose it to be helpful but it's not a strict requirement (that "shall" is a requirement on the user, not the compiler). And we'd probably make it a warning by default (like GCC) rather than an error because it's pretty benign.
Author: None (llvmbot)
Extended Description
This report is a technical point more than anything. Both the C90 and C18 standards have a "conformance chapter" (chapter 4 in C18), whose paragraph 1 states the following:
"In this International Standard, “shall” is to be interpreted as a requirement on an implementation or on a program; conversely, “shall not” is to be interpreted as a prohibition"
The technical point is the following: in section 5.1.1.2 of C18 (C90 has a similar section), translation phase 2 states:
"Each instance of a backslash character (\) immediately followed by a new-line character is deleted, splicing physical source lines to form logical source lines. Only the last backslash on any physical source line shall be eligible for being part of such a splice. A source file that is not empty shall end in a new-line character, which shall not be immediately preceded by a backslash character before any such splicing takes place."
Consider the following program:
simple_case.c
clang version 6.0.0-1ubuntu2 (tags/RELEASE_600/final)
Target: x86_64-pc-linux-gnu
Thread model: posix
InstalledDir: /usr/bin

gcc (Ubuntu 8.3.0-6ubuntu1~18.04) 8.3.0
Copyright (C) 2018 Free Software Foundation, Inc.
This is free software; see the source for copying conditions. There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
The clang version above compiles it just fine, and the gcc version above issues the following warning:
"c.c:2:1: warning: backslash-newline at end of file \
"
If instead we have:
elaborate_case.c
" int main(){ return 0; } \
"
then they both compile fine without warning.
Question: according to the standard "A source file that is not empty shall end in a new-line character, which shall not be immediately preceded by a backslash character before any such splicing takes place.", are we not required by the standard to treat such a text file, the elaborate_case.c, as not a "source file"? That is, it should not compile.