Closed nico closed 4 years ago
-Wmisleading-indentation will be in clang 10.0. Thanks, Tyker!
It's happening: https://reviews.llvm.org/D70638 \o/
These days, clang-tidy has a check that seems to at least partially overlap what we're talking about here:
https://clang.llvm.org/extra/clang-tidy/checks/readability-misleading-indentation.html
The 4th attachment is the output of my old wip patch on Chromium's "ui" target.
It flags this indent: https://chromium.googlesource.com/chromium/src/+/71c71a677ae0a8e0004a561f6d558b33ae9c1af3/third_party/libpng/pngread.c#122 (which is correct in that the indent is off, but it's not a behavioral bug; and it's fixed in libpng trunk)
The same is true here: https://chromium.googlesource.com/chromium/src/+/71c71a677ae0a8e0004a561f6d558b33ae9c1af3/third_party/libpng/pngpread.c#1104
https://chromium.googlesource.com/chromium/src/+/71c71a677ae0a8e0004a561f6d558b33ae9c1af3/third_party/libpng/pngwutil.c#375 kind of looks like a real (but also benign) bug:
if (comp->max_output_ptr != 0) png_free(png_ptr, comp->output_ptr); comp->output_ptr=NULL;
This code looks pretty different at trunk libpng, so it might be interesting to see what the motivation for the change was.
nspr is just weird indent again: https://github.com/servo/nspr/blob/master/pr/src/misc/prsystem.c#L106
https://chromium.googlesource.com/chromium/deps/icu.git/+/4b5a7240d2dfa0cb5627f20a26ceffd67930d134/source/i18n/rbnf.cpp#325 is arguably a bug in the warning approach. But it looks like ICU decided that formatting was too clever in https://chromium.googlesource.com/chromium/deps/icu.git/+/b31896655a701874d13e70cc24bca95e3e66a991%5E%21/source/i18n/rbnf.cpp
https://chromium.googlesource.com/chromium/deps/icu.git/+/0f8746ae6655c753514c329dfbe044e00e36a1b3/source/i18n/dtfmtsym.cpp#912 is the "incorrect indent, but harmless, but still good to fix" thing again.
...and so on. If you're interested in looking at this, I could re-run the check and post what it finds in Chromium. (If you're coming to LLVM conf, we could co-hack on this.)
The approach in described in comment 0 only does actual work after "few" AST nodes, so maybe it's not super expensive. It does need to track indent more often though. I don't remember if I had run any benchmarks.
Any examples of true positives other than famous goto fail;?
GCC tracks a lot of line data with -Wmisleading-indentation so probably this can slow down a compiler a lot. But we could do it heuristic-based based on typical true positives -> faster, less false positives.
GCC 6 will have this feature (-Wmisleading-indentation), would be nice if clang did too...
I just did an svn up and it resulted in a conflict:
unsigned getColumnNumber(FileID FID, unsigned FilePos, bool Invalid = nullptr) const; unsigned getSpellingColumnNumber(SourceLocation Loc, bool Invalid = nullptr) const;
.r208340
So you might want to update your patch Nico.
IMHO, having this committed would be great. :)
So I've turned on -Windent for some of my open source buildbots:
GDCM: http://open.cdash.org/viewBuildError.php?type=1&buildid=3278038 3 warnings: 2 definitely right to warn, 1 false positive I guess (but can easily add braces):
if( mult == 2 && genidx == -1 )
for(int i = 0; i < Internal->NumberOfPoints; i++ )
array[3*i+1] = p[i + 1];
else if( mult == 2 && genidx == 0 ) //**** warns here
for(int i = 0; i < Internal->NumberOfPoints; i++ )
array[3*i+1] = p[i + 0];
else
for(int i = 0; i < Internal->NumberOfPoints; i++ )
array[3*i+1] = 0;
VTK: http://open.cdash.org/viewBuildError.php?type=1&buildid=3277751 3 warning: all right to warn on
CMake: no warnings
I haven't found a great way to deal with the preprocessor issue described in comment 18. One possible avenue would be to just suppress the warning if the indented line is more than 3 lines below the end of the if/for/while.
(Why 3? because
stuff
is 3 lines long)
The obvious disadvantage of this is that it might skip valid statements due to comment lines.
I can confirm that 'prototype patch v2.3' applies against current trunk, builds, and warns in trivial cases... now to try on some code I use...
prototype patch v2.3 rebased
I'd like to give this a try, but it seems the latest patch no longer applies against trunk… think you could update the patch? Thanks.
Re majnemer comment 19: This warning wouldn't catch that problem, since this warning only fires on unbraced ifs.
See also bug 7555.
Would the following be considered under the aegis of -Wpadded?
if (something) { foo(); } if (something_else) { bar(); }
The bug in this code was that the programmer intended on writing "else if".
Examples that 2.2 warns on where it's "wrong", both from libavformat:
if(!st->codec->has_b_frames)
pktl->pkt.pts= cur_dts;
// if (st->codec->codec_type != AVMEDIA_TYPE_AUDIO) pktl->pkt.duration = duration;
(don't check in commented-out code, duh)
if (authorization && authorization[0])
av_strlcatf(str, size, "%s@", authorization);
/* Determine if hostname is a numerical IPv6 address,
* properly escape it within [] in that case. */
hints.ai_flags = AI_NUMERICHOST;
if (!getaddrinfo(hostname, NULL, &hints, &ai)) {
...
} else
/* Not an IPv6 address, just output the plain string. */
av_strlcat(str, hostname, size);
I don't know if the parser has a way to ask the preprocessor if anything was ifdef'd out between the current token and the last statement, else that should probably be done too.
prototype patch v2.2 This ignores the warning if if/for/while and following element are on the same line.
This does pretty well -- everything it fires for that I looked at is at least confusing, for example this from Harfbuzz:
if (unlikely (!c->len)) return TRACE_RETURN (false);
if (!digest->may_have (c->glyphs[0])) return TRACE_RETURN (false);
return TRACE_RETURN (dispatch (c));
...but while it finds quite a bit of questionable indentation, it hasn't found a "real" bug yet.
Output for chromium's "ui" target Patch v2.1 does pretty well, it only finds true positives as far as I can tell – but it finds quite a few, and most of them harmless. (You can use http://cs.chromium.org to look at all these files.)
For example, in icu some code accidentally got indented a column too much:
case NARROW :
if (fStandaloneNarrowMonths)
delete[] fStandaloneNarrowMonths;
fStandaloneNarrowMonths = newUnicodeStringArray(count);
uprv_arrayCopy( monthsArray,fStandaloneNarrowMonths,count);
fStandaloneNarrowMonthsCount = count;
break;
Same problem in libpng's pngread.c:
if (user_png_ver) { i = 0; do { if (user_png_ver[i] != png_libpng_ver[i]) png_ptr->flags |= PNG_FLAG_LIBRARY_MISMATCH; } while (png_libpng_ver[i++]); } else png_ptr->flags |= PNG_FLAG_LIBRARY_MISMATCH;
if (png_ptr->flags & PNG_FLAG_LIBRARY_MISMATCH)
{
...
}
(Note that everything is indented 1 relative to the first if, which is actually kind of confusing to read.)
…and I suppose it's necessary to suppress the warning if both statements are on the same line, as stuff like
while (*list && *list != c) ++list; return *list == c;
(this one from icu) is pretty common.
prototype patch v2.1 Now resets IndentStmt in more places so that this doesn't warn on
~scopedrefptr() { if (ptr) ptr_->Release(); }
T* get() const { return ptr_; }
prototype patch v2 Now doesn't warn on mixed tabs and spaces.
Another type of false positive:
if (!(ev->ev_events & (EV_READ|EV_WRITE)))
return (0);
poll_check_ok(pop);
The semicolon is more indented than the if due to the empty macro. Maybe suppress warning for empty statements?
From libpng's png_write_compressed_data_out, might actually be a bug:
if (comp->max_output_ptr != 0) png_free(png_ptr, comp->output_ptr); comp->output_ptr=NULL;
From skia:
void pop(T elem) { if (elem) elem = (*this)[fCount - 1]; --fCount; }
Not sure if the warning should be suppressed if the thing after the if is on the same line as the if as that is kind of hard to read for humans too.
False positive on re2c generated code:
{ if(cursor == s->eof) Scanner_fatal(s, "missing '}'"); s->pos = cursor; s->cline++; goto code; }
Also has mixed spaces / tabs, so ignoring lines lines like that should probably be the next change.
"False" positive on sfntly's font_data.cc:
bool FontData::Bound(int32_t offset) { if (offset > Size() || offset < 0) return false;
boundoffset += offset; return true; }
prototype patch v1 This is with the modified warning approach described above.
This triggers on
if (reply)
req->user_callback(DNS_ERR_NONE, DNS_IPv6_AAAA,
reply->data.aaaa.addrcount, ttl,
reply->data.aaaa.addresses,
req->user_pointer);
else
req->user_callback(err, 0, 0, 0, NULL, req->user_pointer);
return;
in libevent -- note mixed spaces and tabs again (at least not on the same line). Probably want to disable the warning if the whitespace before the if/for/while and the line that's warned on contain a mix of tabs and spaces.
Similar, also from nss:
if (mp_cmp_z(&f) == 0) { res = MP_UNDEF; } else for (;;) { ... } stmt;
Another false positive, from nss's ssl3con.c:
if (dh_g.len > dh_p.len || !ssl3_BigIntGreaterThanOne(&dh_g))
goto alert_loser;
rv = ssl3_ConsumeHandshakeVariable(ss, &dh_Ys, 2, &b, &length);
(Note mixed tabs and spaces on the same line 9_9)
Also sometimes seen in nss:
if (foo()) do ... while (...); stmt;
(would be fixed by checking if thing-after-if has higher indent than if)
prototype patch v0.3 (reset IndentStmt at the start of functions too)
Another false positive, from nss's sslcon.c: if (certLen + responseLen + SSL_HL_CLIENT_CERTIFICATE_HBYTES
ss->gs.recordLen) { / prevent overflow crash. / rv = SECFailure; } else rv = ssl2_HandleClientCertificate(ss, data[1], data + SSL_HL_CLIENT_CERTIFICATE_HBYTES, certLen, data + SSL_HL_CLIENT_CERTIFICATE_HBYTES + certLen, responseLen); if (rv) { rv2 = ssl2_SendErrorMessage(ss, SSL_PE_BAD_CERTIFICATE); SET_ERROR_CODE goto loser; }
Looks like not indenting the body of an if or else is pretty common. Next approach is to check if the thing after an if/else/for/while has a higher indent than the if/else/for/while itself and warning on that. That'd fix most of the false positives above.
More false positives:
From libpng's pngset.c:
if (png_ptr != NULL)
png_ptr->asm_flags = 0;
asm_flags = asm_flags; /* Quiet the compiler */
Lots of stuff similar to this in libpng's pngget.c:
if (png_ptr != NULL && info_ptr != NULL)
if (info_ptr->valid & PNG_INFO_oFFs) { png_debug1(1, "in %s retrieval function", "png_get_y_offset_microns");
if (info_ptr->offset_unit_type != PNG_OFFSET_MICROMETER)
return (0);
else
return (info_ptr->y_offset);
}
return (0);
return (0);
From nspr's prsystem.c:
if (cmd == PR_SI_HOSTNAME_UNTRUNCATED)
break;
/*
* On some platforms a system does not have a hostname and
* its IP address is returned instead. The following code
* should be skipped on those platforms.
*/
/* Return the unqualified hostname */
while (buf[len] && (len < buflen)) {
if (buf[len] == '.') {
buf[len] = '\0';
break;
}
len += 1;
}
True (harmless) positive from yasm's code.c:
for(i = 0; i < g->nSpans; ++i)
if(g->span[i].to != def)
*(t++) = &g->span[i];
if (dFlag)
fputs("\tYYDEBUG(-1, yych);\n", o);
prototype patch v0.2 With silly hack to not warn on nested if-elses like so:
if (1) { if (1) goto fail; else if (1) goto fail; else if (1) goto fail; else goto fail; } else if (1) { if (1) goto fail; }
(probably want to have some kind of stack in a real patch)
Also incorrectly warns about this from dtoa.cc (Honor_FLT_ROUNDS is not defined):
if (bc.dsign)
break; /* Must use bigcomp(). */
if (bc.rounding != 1) {
if (i < 0)
break;
}
else
{
bc.nd = nd;
i = -1; /* Discarded digits make delta smaller. */
}
Extended Description
Wouldn't it be great if clang warned on something like
if (5 != 0) goto fail; goto fail;
?
This bug tracks various ideas for warnings about this based on indent level. (There's already -Wunreachable-code for the CFG approach).
Version 1: Remember only child of unbraced ifs, fors, whiles, and warn if next statement has same indent as that child. Prototype patch attached (it doesn't reset IndentStmt often enough).
It has false positives on this code in udis86.c in mach_override:
if (ud_input_end(u)) return 0;
u->asm_buf[0] = 0;
if (ud_decode(u) == 0) return 0;
Also on this from mach_override.c, but looking at the style in the rest of the file that might actually be a (harmless) true positive: