Open jfroy opened 7 years ago
This poses a bigger question of whether Abseil should be -Wsign-conversion clean. I can easily write a static_cast here, but there are still a bunch of other warnings
absl/strings/escaping.cc: In function 'int absl::{anonymous}::hex_digit_to_int(char)': absl/strings/escaping.cc:67:32: warning: conversion to 'unsigned char' from 'char' may change the sign of the result [-Wsign-conversion] assert(absl::ascii_isxdigit(c)); ^ absl/strings/escaping.cc: In function 'bool absl::{anonymous}::CUnescapeInternal(absl::string_view, bool, char, ptrdiff_t, std::string)': absl/strings/escaping.cc:132:32: warning: conversion to 'unsigned int' from 'int' may change the sign of the result [-Wsign-conversion] unsigned int ch = p - '0'; ^ absl/strings/escaping.cc:133:71: warning: conversion to 'unsigned int' from 'const char' may change the sign of the result [-Wsign-conversion] if (p < last_byte && is_octal_digit(p[1])) ch = ch 8 + ++p - '0'; ^ absl/strings/escaping.cc:135:30: warning: conversion to 'unsigned int' from 'const char' may change the sign of the result [-Wsign-conversion] ch = ch 8 + ++p - '0'; // now points at last digit ^ absl/strings/escaping.cc:139:55: warning: conversion to 'std::basic_string
::size_type {aka long unsigned int}' from 'long int' may change the sign of the result [-Wsign-conversion] std::string(octal_start, p + 1 - octal_start) + ^ absl/strings/escaping.cc:148:46: warning: conversion to 'size_t {aka long unsigned int}' from 'ptrdiff_t {aka long int}' may change the sign of the result [-Wsign-conversion] memcpy(d, octal_start, octal_size); ^ absl/strings/escaping.cc:160:48: warning: conversion to 'unsigned char' from 'char' may change the sign of the result [-Wsign-conversion] } else if (!absl::ascii_isxdigit(p[1])) { ^ absl/strings/escaping.cc:166:60: warning: conversion to 'unsigned char' from 'char' may change the sign of the result [-Wsign-conversion] while (p < last_byte && absl::ascii_isxdigit(p[1])) ^ absl/strings/escaping.cc:168:51: warning: conversion to 'unsigned int' from 'int' may change the sign of the result [-Wsign-conversion] ch = (ch << 4) + hex_digit_to_int(++p); ^ absl/strings/escaping.cc:171:69: warning: conversion to 'std::basic_string ::size_type {aka long unsigned int}' from 'long int' may change the sign of the result [-Wsign-conversion] error = "Value of \" + std::string(hex_start, p + 1 - hex_start) + ^ absl/strings/escaping.cc:180:42: warning: conversion to 'size_t {aka long unsigned int}' from 'ptrdiff_t {aka long int}' may change the sign of the result [-Wsign-conversion] memcpy(d, hex_start, hex_size); ^ absl/strings/escaping.cc:194:53: warning: conversion to 'std::basic_string::size_type {aka long unsigned int}' from 'long int' may change the sign of the result [-Wsign-conversion] std::string(hex_start, p + 1 - hex_start); ^ absl/strings/escaping.cc:200:42: warning: conversion to 'unsigned char' from 'char' may change the sign of the result [-Wsign-conversion] if (absl::ascii_isxdigit(p[1])) { ^ absl/strings/escaping.cc:201:57: warning: conversion to 'unsigned int' from 'int' may change the sign of the result [-Wsign-conversion] rune = (rune << 4) + hex_digit_to_int(*++p); // Advance p. ...
and so on for a while. I think this is the reason that we don't use this option internally -- by the time we started compiling with -Werror there were way too many of these to fix.
As a simple experiment to look at the size of the problem, which is large:
$ bazel clean; bazel build absl/... --copt=-Wsign-conversion 2>/tmp/sign-warnings
$ wc /tmp/sign-warnings -l
5466 /tmp/sign-warnings
$ bazel clean; bazel build absl/... 2>/tmp/nosign-warnings
$ wc /tmp/nosign-warnings -l
15 /tmp/nosign-warnings
Like I said internally - I could swear we did some of the sign conversion cleanup before release, so I'm surprised that this is busted. But at the same time, with those counts, it's gonna be a big silly/messy cleanup to resolve it all. I'm sorta torn.
We have a -Weverything -Werror -Wno-... codebase, so we're used to dealing with these sorts of issues. Our blanket hammer is to flag a subtree as system headers. In some cases we'll add push/pop diagnostics. tl;dr; we can work around this for sure and it's not blocking us. Perhaps making the headers clean may be worth considering, if not too big an effort.
I am not an expert, but I assume this conversion to saturate npos is not UB?
@jfroy - As a library that expects to be at/near the bottom of the dependency stack, I think we ought to be as warning free as possible. So I'd like to say we should just clean it up ... but bleh.
Wraparound on unsigned ints is defined, so the single item that motivated this is "fine". Largely my hesitation is that the vast majority of the hits on this warning are false-positives. (No good answer.)
Perhaps this should be marked as low hanging fruit for the community to help out while you guys can spend time on more impactful matters.
@brenoguim to clarify Titus' statement, we probably don't want to litter Abseil with static_casts everywhere if most of them are to clear up false positives from the compiler. Automatically compiling Abseil TUs with +Wsign-conversion or -Wnosign-conversion (can't remember off the top of my head how to disable a warning) seems wrong too, however.
Oh yes, of course. I just figured there might be ways to rework the code so these conversions do not happen or happen in such way the compiler doesn't complain. But I haven't looked into the scenarios, so I wouldn't know. Also it's not always possible or would require worse code juggling than simply adding the static_cast.
absl/strings/escaping.cc:67:32: warning: conversion to 'unsigned char' from 'char' may change the sign of the result [-Wsign-conversion] assert(absl::ascii_isxdigit(c));
This one already has a static_cast on the following line (escaping.cc:68), so it could be possible to rework the code a bit and reuse that static_cast.
absl/strings/escaping.cc: In function 'bool absl::{anonymous}::CUnescapeInternal(absl::string_view, bool, char, ptrdiff_t, std::string)': absl/strings/escaping.cc:132:32: warning: conversion to 'unsigned int' from 'int' may change the sign of the result [-Wsign-conversion] unsigned int ch = p - '0'; ^ absl/strings/escaping.cc:133:71: warning: conversion to 'unsigned int' from 'const char' may change the sign of the result [-Wsign-conversion] if (p < last_byte && is_octal_digit(p[1])) ch = ch 8 + ++p - '0'; ^ absl/strings/escaping.cc:135:30: warning: conversion to 'unsigned int' from 'const char' may change the sign of the result [-Wsign-conversion] ch = ch 8 + ++p - '0'; // now points at last digit
These other three could use a function to avoid repeating *p - '0'
.
But anyway, I'm certain it's not possible to rework all of them, but some can, and may even lead to slightly more readable code.
Some of these will be super noisy to clean up, given that some large code bases use -funsigned-char (cough cough), and depending on that these warnings will trigger one way or the other.
This can't be the only library to suffer from this. I wonder what other people are doing. Boost seems to have a "numeric_cast" which is a static_cast with bound checking. But I'm not sure if they use internally on their own code.
Despite being a silly issue and potentially with many false positives, it sparked my interest. I don't want to believe it's not possible to write readable and warning free C++.
It is generally impossible to write readable warning free C++ for an arbitrary definition of warning. Usually compilers do not warn on system headers that are included via -isystem. That way, people can switch on whatever warnings they think make sense in their own projects without hurting downstream users from switching on their warnings.
Yeah, we use the -isystem for the boost library as well. I agree it's not possible to do a completely warning free code. On the other hand, I believe it's possible to do better. Just "giving up" on these warnings could also lead to less readable code and hide bugs.
Anyway, I'll stop guessing and try to come up with a PR that shows my point. :)
@tituswinters I propose that we follow @brenoguim's idea and mark this as "help wanted" and "good first issue" and expand the scope of the issue to just in general being about when Abseil code emits compiler warnings. Most of these are easy to fix and the only decider is whether or not the fix is something we want to accept. Furthermore a lot of these are hard to test for internally where we have tightly controlled compiler flags in our toolchain.
or you can/should just test it on kokoro once the export from Google code base is done ;)
@Mizux I'm not sure what you're suggesting. We can't have a kokoro test for every possible compiler configuration. Could you clarify?
https://github.com/abseil/abseil-cpp/blob/master/absl/types/span.h#L282