llvm / llvm-project

The LLVM Project is a collection of modular and reusable compiler and toolchain technologies.
http://llvm.org
Other
28.27k stars 11.68k forks source link

[clang] Disable POSIX builtin library functions in MSVC mode #58783

Open zufuliu opened 1 year ago

zufuliu commented 1 year ago

Filled as a follow up issue for issue #56607.

clang-cl fails for following test C code:

int index;

int get(void) {
    return index;
}
D:\>clang-cl /c test.c
test.c(1,5): error: redefinition of 'index' as different kind of symbol
int index;
    ^
test.c(1,5): note: unguarded header; consider using #ifdef guards or #pragma once
test.c(1,5): note: previous definition is here
test.c(4,9): error: incompatible pointer to integer conversion returning 'char *(const char *, int)' from a function
      with result type 'int' [-Wint-conversion]
        return index;
               ^~~~~
2 errors generated.

I think this needs to be disabled in MSVC mode (clang-cl driver or --target=<arch>-pc-windows-msvc), as they are not supported by cl.exe and the C runtime library.

CC @AaronBallman, @rnk, @zmodem

efriedma-quic commented 1 year ago

MSVC has a bunch of "posix" functions; you'd have to go through function by function to figure out which ones it actually has.

We should consider downgrading this error to a warning, like we do for incompatible function declarations of builtins. (Or even if we keep it an error, we should consider a dedicated diagnostic; the current diagnostic is confusing.)

AaronBallman commented 1 year ago

Agreed that the current diagnostic text is very confusing.

We should consider downgrading this error to a warning, like we do for incompatible function declarations of builtins.

I'm wondering whether we should consider removing this error and instead rely on the incompatible function declarations to diagnose only when the user is trying for a function declaration. e.g.,

// Assume no headers have been included.
int strcpy; // No diagnostic, happily accepted
struct strcmp { // No diagnostic, happily accepted
  float sin; // No diagnostic, happily accepted
};
void index(const char *str); // Incompatible function declaration warning

My thinking is: if the user hasn't included any headers, we shouldn't have builtin library functions that conflict with user identifiers unless it looks like a function declaration, even though the standard makes (some of) these reserved identifiers. I'm a bit worried that the C++23 feature adding a ton of constexpr support to math interfaces will mean we have an explosion in library builtins that might conflict with identifiers from the user.

That said, perhaps there are good reasons why we want to diagnose here that I'm not thinking of yet.

efriedma-quic commented 1 year ago

Reasons to diagnose:

  1. It's formally undefined behavior. ("All identifiers with external linkage in any of the following subclauses (including the future library directions) and errno are always reserved for use as identifiers with external linkage.")
  2. The C library could try to call the variable (if it isn't specifically designed to defend against this sort of thing).
  3. The compiler implicitly generate a call to the variable. (The set of non-underscore-prefixed functions the compiler calls isn't that large, but it's currently not zero.)
  4. Other code in the program could try to call the variable.
  5. If any of 2-4 happens, the linker currently won't catch that you're doing it, so it just segfaults at runtime.

None of these reasons are likely to be relevant for "index" specifically, but could be relevant for more commonly used names. If your program declares an "int memcpy;" anywhere, it will almost certainly crash.

AaronBallman commented 1 year ago

It's formally undefined behavior. ("All identifiers with external linkage in any of the following subclauses (including the future library directions) and errno are always reserved for use as identifiers with external linkage.")

Agreed that it's formally UB (on the assumption that POSIX has similar wording for reserved identifiers given that index is not a C standard library function), and we need to update -Wreserved-identifier for that bit.

The C library could try to call the variable (if it isn't specifically designed to defend against this sort of thing).

Ohhhh, if the C library is using weak symbols that are resolved at runtime, the symbol from the user's program could theoretically be what's loaded...

If any of 2-4 happens, the linker currently won't catch that you're doing it, so it just segfaults at runtime.

Yeah, agreed that we shouldn't be silent here and at least need a warning. Thanks for the discussion!

In terms of error vs warning... I think an error is awfully mean for identifiers like index given that GNU-compatible mode is our default, but at the same time, if we had a really good error diagnostic that helps the user understand what's wrong, that could go a long ways towards addressing my concerns with the current behavior. My issue with the way things are now is: how does anyone but a language lawyer know what is going on with that failure?

zufuliu commented 1 year ago

It seems MSVC only reserves a few identifiers that starts with underscore (compiler builtins). for following code:

int strlen; // none
int _BitScanForward; // both
int __builtin_strlen;// C++
D:\>cl /c /TC /Wall test.c
Microsoft (R) C/C++ Optimizing Compiler Version 19.33.31630 for x64
Copyright (C) Microsoft Corporation.  All rights reserved.

test.c
test.c(2): error C2365: '_BitScanForward': redefinition; previous definition was 'function'

D:\>cl /c /TP /Wall test.c
Microsoft (R) C/C++ Optimizing Compiler Version 19.33.31630 for x64
Copyright (C) Microsoft Corporation.  All rights reserved.

test.c
test.c(2): error C2365: '_BitScanForward': redefinition; previous definition was 'function'
predefined C++ types (compiler internal)(290): note: see declaration of '_BitScanForward'
test.c(3): error C2365: '__builtin_strlen': redefinition; previous definition was 'function'
predefined C++ types (compiler internal)(275): note: see declaration of '__builtin_strlen'
D:\>clang-cl /c /TC /Wall test.c
test.c(1,5): error: redefinition of 'strlen' as different kind of symbol
int strlen; // none
    ^
test.c(1,5): note: unguarded header; consider using #ifdef guards or #pragma once
test.c(1,5): note: previous definition is here
test.c(2,5): error: redefinition of '_BitScanForward' as different kind of symbol
int _BitScanForward; // both
    ^
test.c(2,5): note: unguarded header; consider using #ifdef guards or #pragma once
test.c(2,5): note: previous definition is here
test.c(2,5): warning: identifier '_BitScanForward' is reserved because it starts with '_' followed by a capital letter
      [-Wreserved-identifier]
int _BitScanForward; // both
    ^
test.c(3,5): error: redefinition of '__builtin_strlen' as different kind of symbol
int __builtin_strlen;// C++
    ^
test.c(3,5): note: unguarded header; consider using #ifdef guards or #pragma once
test.c(3,5): note: previous definition is here
test.c(3,5): warning: identifier '__builtin_strlen' is reserved because it starts with '__' [-Wreserved-identifier]
int __builtin_strlen;// C++
    ^
2 warnings and 3 errors generated.