llvm / llvm-project

The LLVM Project is a collection of modular and reusable compiler and toolchain technologies.
http://llvm.org

Document which options are required to be enabled/disabled for conformance and strict conformance of a program to a given language standard #57987

Open hvdijk opened 2 years ago

hvdijk commented 2 years ago

Per the discussion at https://discourse.llvm.org/t/configure-script-breakage-with-the-new-werror-implicit-function-declaration/65213 with @AaronBallman, there is no way that anyone knows of to invoke Clang in a C-conforming mode. He says the option documented at https://clang.llvm.org/c_status.html is not intended to do so, and it does not do so. Which options (if any) do is not documented anywhere and is not known, and no work on fixing this is planned. As such, I am filing a ticket to remove the false claim that "Clang implements the following published and upcoming ISO C standards:"

AaronBallman commented 2 years ago

As such, I am filing a ticket to remove the false claim that "Clang implements the following published and upcoming ISO C standards:"

This claim is not false and documentation will not be modified in that way.

However, the thrust of the request here is valid: it would be nice to tell users "if you want the most strictly conforming mode of the compiler, here's how you get it". For example, Clang does not enable -Wreserved-identifier by default, but definition of a reserved identifier is UB. Similarly, Clang enables some diagnostics as errors by default (which can be downgraded to warnings), such as -Wint-conversion or -Watomic-access, that are triggered on code which technically should be accepted by a conforming implementation if the code is never executed. However, such documentation is going to be extremely fragile and difficult to write with much precision. We have targets that do non-conforming things, so we have to look at the matrix of language mode and target. Further, there are parts of the standard we cannot conform to because the standard does not make an allowance for a C compiler that, like Clang, is not tightly coupled to the C Standard Library implementation. There's also the matter of "do DRs count towards conformance?", which is outside the scope of the standard to answer but is important nonetheless because it impacts code portability (which is the only purpose to writing conforming code in the first place). I'm sure there are plenty of other considerations I'm not thinking of yet.

So while the request is easy to spell out, the execution of that request is likely to be labor intensive and require oversight to keep from bit rotting as diagnostic behavior changes.

hvdijk commented 2 years ago

This claim is not false

You said there is no known way to get Clang to try to act as a conforming C90 implementation. I cannot see this as compatible with the claim that "Clang implements all of the ISO 9899:1990 (C89) standard." One or the other has to be false.

AaronBallman commented 2 years ago

Clang implements the standard (modulo bugs, of course). That you need to pass different warning flags to get problematic code to compile does not impact conformance. There is no requirement that the default set of options for the implementation results in a conforming C or C++ compiler. There is no documentation that we provide that says the default set of options will give you a conforming C or C++ compiler. You are assuming that -std= will give you a mode that will accept all code and not diagnose anything unless it's dynamically reachable and that is not a valid assumption.
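As a small illustration of what -std= does select, the sketch below reports the language version a translation unit was compiled under, using the standard __STDC_VERSION__ macro. The function name language_mode is hypothetical and only for illustration. It shows nothing about which diagnostics are errors in that mode, which is the question under discussion here.

```c
/* Hypothetical helper: names the C language version selected at compile
   time (for example by -std=c99). */
const char *language_mode(void)
{
#if !defined(__STDC_VERSION__)
    return "C90";           /* C90 itself does not define __STDC_VERSION__ */
#elif __STDC_VERSION__ < 199901L
    return "C95";           /* 199409L: C90 plus Amendment 1 */
#elif __STDC_VERSION__ < 201112L
    return "C99";
#elif __STDC_VERSION__ < 201710L
    return "C11";
#elif __STDC_VERSION__ < 202000L
    return "C17";
#else
    return "C2x/C23 or later";
#endif
}
```

Calling language_mode() from a program built with different -std= values should return different strings, while the set of default-error diagnostics is a separate matter.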

hvdijk commented 2 years ago

That isn't what I've said. I never said there is any requirement for Clang to be conforming in its default mode. I am only saying there is a requirement for Clang to be conforming in some mode, and if that mode exists at all, it is not a mode that you are aware of, it is not a mode that I am aware of, and I strongly suspect it is not a mode anyone is aware of.

You are assuming that -std= will give you a mode that will accept all code and not diagnose anything unless it's dynamically reachable and that is not a valid assumption.

This too is not what I am assuming and given how explicitly I have already stated that this is not true, I cannot see how you can think this. Clang is free to warn about any and all code in conforming mode, it's only not free to error on valid code in conforming mode. As for which flags are needed to get into a conforming mode, it does not matter what they are, so long as it is documented. I was expecting -std=* to be enough because the documentation strongly hints it's enough. We've established that it's not and that there is no documentation that will say what set of flags can be used instead.

If no one can get Clang to try to conform to C90, it does not conform to C90.

Note that all of this is only a horrible hassle for documentation because you decided that Clang's -std= and -pedantic(-errors) should not be GCC-compatible, despite the options being copied from GCC. If they were GCC-compatible, we would not be having this issue, we would be able to just say clang -std=c90 -pedantic will get you a conforming C90 implementation, just like gcc -std=c90 -pedantic will already get you a conforming C90 implementation as per its documentation.

(Edit: PS:

For example, Clang does not enable -Wreserved-identifier by default, but definition of a reserved identifier is UB.

The C standard does not require a diagnostic for UB. If a diagnostic is useful, it may be a worthwhile addition, but to leave it off does not affect conformance.)

AaronBallman commented 2 years ago

That isn't what I've said. I never said there is any requirement for Clang to be conforming in its default mode. I am only saying there is a requirement for Clang to be conforming in some mode, and if that mode exists at all, it is not a mode that you are aware of, it is not a mode that I am aware of, and I strongly suspect it is not a mode anyone is aware of.

Thank you for clarifying, sorry for my misunderstanding! Just because I don't know the full set of flags required to put the compiler into a conforming mode for a given language standard does not mean the set does not exist.

Clang is free to warn about any and all code in conforming mode, it's only not free to error on valid code in conforming mode.

As discussed plenty already, that is not correct.

The C standard does not require a diagnostic for UB. If a diagnostic is useful, it may be a worthwhile addition, but to leave it off does not affect conformance.)

This is also not correct. Clause 4 goes into detail on this. In C2x, Clause 4 says (p5): A strictly conforming program shall use only those features of the language and library specified in this document. It shall not produce output dependent on any unspecified, undefined, or implementation-defined behavior, and shall not exceed any minimum implementation limit.

p6: The two forms of conforming implementation are hosted and freestanding. A conforming hosted implementation shall accept any strictly conforming program. A conforming implementation may have extensions (including additional library functions), provided they do not alter the behavior of any strictly conforming program.

p8: A conforming program is one that is acceptable to a conforming implementation.

Therefore, a program that uses something described as undefined behavior in the standard is not a conforming program unless we've defined the behavior for our implementation as an extension.

C2x 6.4.2.1p7 and 8 go on to say: Some identifiers are reserved.
- All identifiers that begin with a double underscore (__) or begin with an underscore (_) followed by an uppercase letter are reserved for any use, except those identifiers which are lexically identical to keywords.
- All identifiers that begin with an underscore are reserved for use as identifiers with file scope in both the ordinary and tag name spaces.

Other identifiers may be reserved, see 7.1.3.

If the program declares or defines an identifier in a context in which it is reserved (other than as allowed by 7.1.4), the behavior is undefined.

So -Wreserved-identifier is one such flag you'd need to enable to get full conformance checking mode. (The allowance in 7.1.4 is for declaring identifiers from standard library headers without including the header, which means the user is allowed to declare [[noreturn]] void _Exit(int); without triggering undefined behavior.)
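A minimal sketch of the 7.1.4 allowance, using abs rather than _Exit so that it does not depend on C2x attribute syntax. The helper name abs_difference is hypothetical. The assumption here is that abs qualifies because its prototype needs no type defined in a header.

```c
/* Declared by hand instead of via <stdlib.h>. 7.1.4 permits declaring and
   using a library function this way when its prototype refers to no type
   defined in a header. */
int abs(int);

/* Hypothetical helper that uses the hand-written declaration. */
int abs_difference(int a, int b)
{
    return abs(a - b);
}

/* By contrast, a user declaration such as

       int __counter;

   uses a reserved identifier and has undefined behaviour, so it is left
   in a comment here. */
```

For example, abs_difference(3, 10) evaluates to 7.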

hvdijk commented 2 years ago

As discussed plenty already, that is not correct.

I thought you had already agreed that it was correct. There is code that we agree is valid C99 and that is rejected by clang -std=c99. My understanding of your argument for why that's okay is that clang -std=c99 is not meant to be conforming mode, and that clang -std=c99 -Wno-a -Wno-b -Wno-c <...> for unknown a, b, c, ... should be used instead to get Clang to act in conforming mode. Am I misunderstanding you here?

For the other point, you quote "A conforming hosted implementation shall accept any strictly conforming program." clang -std=c99 does not accept this strictly conforming program, therefore it is not a conforming implementation.

This is also not correct.

It is. There is no dispute that a program that uses reserved identifiers is not strictly conforming, but the C standard distinguishes between syntax errors, constraint violations, and undefined behaviour. Syntax errors and constraint violations require a diagnostic from a conforming implementation, undefined behaviour does not.

5.1.1.3 Diagnostics

A conforming implementation shall produce at least one diagnostic message (identified in an implementation-defined manner) if a preprocessing translation unit or translation unit contains a violation of any syntax rule or constraint, even if the behavior is also explicitly specified as undefined or implementation-defined. Diagnostic messages need not be produced in other circumstances.
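The distinction can be sketched with a small example, offered as an illustration rather than as anything taken from the standard's text. The function name quotient is hypothetical. The definition violates no syntax rule or constraint, so 5.1.1.3 requires no diagnostic for it, even though some calls would have undefined behaviour.

```c
/* Undefined behaviour occurs only if this is actually called with d == 0.
   Nothing here violates a syntax rule or constraint, so no diagnostic is
   required for the definition itself. */
int quotient(int n, int d)
{
    return n / d;
}

/* By contrast, the following initialization violates the constraints for
   simple assignment (an int converted to a pointer without a cast) and so
   requires at least one diagnostic. It is left in a comment so that the
   block compiles:

       int *p = 42;
*/
```

A call such as quotient(7, 2) is well defined and yields 3. Whether a given diagnostic is a warning or an error is up to the implementation.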

hvdijk commented 2 years ago

I realise now it's worse: the same analysis for the previously given program also applies to

struct A { int x; };
struct B { int x; };
void f(a) struct A a; {} /* never called */
void g(b) struct B b; { f(b); } /* never called */
int main() { return 0; }

This is a strictly conforming C99/C11/C17 program. (C90, possibly not based on what you quoted, though I've not yet checked to be sure.) GCC silently accepts it with -std=c90/c99/c11/c17 -pedantic-errors. MSVC silently accepts it too. Intel accepts it with the same options as GCC, but issues a warning (only a warning, not an error): "warning #180: argument is incompatible with formal parameter". With Clang, I can find no options that get it to accept this program, in which case no amount of documentation changes will be sufficient by themselves.

llvmbot commented 1 year ago

@llvm/issue-subscribers-c