llvm / llvm-project

The LLVM Project is a collection of modular and reusable compiler and toolchain technologies.
http://llvm.org
Other
28.8k stars 11.9k forks source link

Functions that forward printf format strings erroneously trigger -Wformat-security #64651

Open pkasting opened 1 year ago

pkasting commented 1 year ago

The following code triggers -Wformat-security on the first call to f2(), even though the fact that f22() is itself tagged with the format attribute should let the compiler realize that the first arg is a literal at the original callsite:

void __attribute__((format(printf, 1, 2))) f2(const char* format, ...);
void __attribute__((format(printf, 1, 2))) f22(const char* format, ...) {
    f2(format);     // Warning (incorrect)
    f2(format, 1);  // No warning (correct)
}

void bar() {
    f22("Test");
}

Godbolt repro: https://godbolt.org/z/qeqjP7dr7

Strangely, the second call to f2() does not produce a warning, though it is safe or unsafe in exactly the same cases as the first call.

zmodem commented 1 year ago

the fact that f22() is itself tagged with the format attribute should let the compiler realize that the first arg is a literal at the original callsite

The code (I believe this would live in checkFormatStringExpr) is not smart enough to do this, but maybe that could be arranged :-)

Strangely, the second call to f2() does not produce a warning, though it is safe or unsafe in exactly the same cases as the first call.

Interesting. That behavior comes from https://github.com/llvm/llvm-project/commit/cc5d1c2e4e8b8 which split the "no-arg" case into a separate warning.

$ cat -n /tmp/a.cc
     1  #include <stdio.h>
     2
     3  void f(const char* format) {
     4      printf(format);
     5      printf(format, 1);
     6  }

$ clang -c /tmp/a.cc -Wformat-nonliteral
/tmp/a.cc:4:12: warning: format string is not a string literal (potentially insecure) [-Wformat-security]
    4 |     printf(format);
      |            ^~~~~~
/tmp/a.cc:4:12: note: treat the string as an argument to avoid this
    4 |     printf(format);
      |            ^
      |            "%s", 
/tmp/a.cc:5:12: warning: format string is not a string literal [-Wformat-nonliteral]
    5 |     printf(format, 1);
      |            ^~~~~~
2 warnings generated.

The first one has the "potentially insecure" note and is part of -Wformat-security which is on by default. The second warning is behind -Wformat-nonliteral which is disabled by default.

I don't know the reasoning behind this, but it seems to match GCC. Perhaps the idea is that when there are additional arguments, it's expected that the user is aware that the function is not just printing, but treating the first arg as a format string.

pkasting commented 1 year ago

the fact that f22() is itself tagged with the format attribute should let the compiler realize that the first arg is a literal at the original callsite

The code (I believe this would live in checkFormatStringExpr) is not smart enough to do this, but maybe that could be arranged :-)

Yes, I think transitively marking the arg as a literal here is a feature request; the current warning is "technically correct, but not what you wanted".

Strangely, the second call to f2() does not produce a warning, though it is safe or unsafe in exactly the same cases as the first call.

I don't know the reasoning behind this, but it seems to match GCC. Perhaps the idea is that when there are additional arguments, it's expected that the user is aware that the function is not just printing, but treating the first arg as a format string.

My request on this bug, at least, would be that neither case trigger either warning.

I think having two warnings for this is a bit suspicious. But either way, you do raise a good point that gcc doesn't do what I'm asking for either, and technically Clang documents the format attribute as a gcc attribute; so is there a general policy on "if we were to decide gcc had a behavior we didn't like, would we maintain compat"?

zmodem commented 1 year ago

The code (I believe this would live in checkFormatStringExpr) is not smart enough to do this

I was wrong: https://github.com/llvm/llvm-project/blob/llvmorg-18-init/clang/lib/Sema/SemaChecking.cpp#L9630

But it's picky, and in the original example it fails these checks: https://github.com/llvm/llvm-project/blob/llvmorg-18-init/clang/lib/Sema/SemaChecking.cpp#L9689 In this case because f22 takes it's format args as varargs, it expects the forwarding call to f2 to take them as a va_list argument, like:

#include <stdarg.h>

void __attribute__((format(printf, 1, 2))) f2(const char* format, va_list);

void __attribute__((format(printf, 1, 2))) f22(const char* format, ...) {
  va_list ap;
  va_start(ap, format);
  f2(format, ap);
  va_end(ap);
}

That last check was added fairly recently (92edd74b37c7a96b1d47dc67cda7f92b65066025).