marian-nmt / marian-dev

Fast Neural Machine Translation in C++ - development repository
https://marian-nmt.github.io
Other
247 stars 123 forks source link

GCC 12 compilation warning: withCommas integer wraparound #990

Open kpu opened 1 year ago

kpu commented 1 year ago

g++ 12 is complaining about a potential integer wraparound in this code:

// format a long number with comma separators
std::string withCommas(size_t n) {
  std::string res = std::to_string(n);
  for(int i = (int)res.size() - 3; i > 0; i -= 3)
    res.insert(i, ",");
  return res;
}

I think it's reasoning that res.size() could cast to a very negative integer. Then res.size() -3 would wrap around to a very large integer resulting in an i that overflows insert. This isn't possible because std::to_string(size_t) will only produce relatively short strings, but the compiler isn't smart enough to know that. So we get a warning.

g++ --version
g++ (Gentoo Hardened 12.2.1_p20230304 p13) 12.2.1 20230304
Copyright (C) 2022 Free Software Foundation, Inc.
This is free software; see the source for copying conditions.  There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
In file included from /usr/lib/gcc/x86_64-pc-linux-gnu/12/include/g++-v12/string:40,
                 from /home/kpu/marian-dev/src/common/utils.h:3,
                 from /home/kpu/marian-dev/src/common/utils.cpp:1:
In static member function ‘static std::char_traits<char>::char_type* std::char_traits<char>::copy(char_type*, const char_type*, std::size_t)’,
    inlined from ‘static void std::__cxx11::basic_string<_CharT, _Traits, _Alloc>::_S_copy(_CharT*, const _CharT*, size_type) [with _CharT = char; _Traits = std::char_traits<char>; _Alloc = std::allocator<char>]’ at /usr/lib/gcc/x86_64-pc-linux-gnu/12/include/g++-v12/bits/basic_string.h:423:21,
    inlined from ‘std::__cxx11::basic_string<_CharT, _Traits, _Allocator>& std::__cxx11::basic_string<_CharT, _Traits, _Alloc>::_M_replace(size_type, size_type, const _CharT*, size_type) [with _CharT = char; _Traits = std::char_traits<char>; _Alloc = std::allocator<char>]’ at /usr/lib/gcc/x86_64-pc-linux-gnu/12/include/g++-v12/bits/basic_string.tcc:532:22,
    inlined from ‘std::__cxx11::basic_string<_CharT, _Traits, _Alloc>& std::__cxx11::basic_string<_CharT, _Traits, _Alloc>::replace(size_type, size_type, const _CharT*, size_type) [with _CharT = char; _Traits = std::char_traits<char>; _Alloc = std::allocator<char>]’ at /usr/lib/gcc/x86_64-pc-linux-gnu/12/include/g++-v12/bits/basic_string.h:2171:19,
    inlined from ‘std::__cxx11::basic_string<_CharT, _Traits, _Alloc>& std::__cxx11::basic_string<_CharT, _Traits, _Alloc>::insert(size_type, const _CharT*) [with _CharT = char; _Traits = std::char_traits<char>; _Alloc = std::allocator<char>]’ at /usr/lib/gcc/x86_64-pc-linux-gnu/12/include/g++-v12/bits/basic_string.h:1928:22,
    inlined from ‘std::string marian::utils::withCommas(size_t)’ at /home/kpu/marian-dev/src/common/utils.cpp:210:15:
/usr/lib/gcc/x86_64-pc-linux-gnu/12/include/g++-v12/bits/char_traits.h:431:56: error: ‘void* __builtin_memcpy(void*, const void*, long unsigned int)’ accessing 9223372036854775810 or more bytes at offsets 3 and [2, 2147483645] may overlap up to 9223372036854775813 bytes at offset -3 [-Werror=restrict]
  431 |         return static_cast<char_type*>(__builtin_memcpy(__s1, __s2, __n));
      |                                        ~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~
cc1plus: all warnings being treated as errors