vincentlaucsb / csv-parser

A high-performance, fully-featured CSV parser and serializer for modern C++.
MIT License
901 stars 150 forks source link

Compiler warning with gcc version 13.2.0 with -O3 #234

Closed rajgoel closed 3 months ago

rajgoel commented 3 months ago

Thanks a lot for the great tool!

I just stumbled across some strange behavior when turning optimization to -O3 with gcc version 13.2.0. I receive the following compiler output:

In file included from /usr/include/c++/13/algorithm:60,
                 from /some_path/build/_deps/csv-src/single_include/csv.hpp:37,
                 from /some_path/LookupTable.h:4,
                 from /some_path/LookupTable.cpp:1:
In static member function ‘static constexpr _Up* std::__copy_move<_IsMove, true, std::random_access_iterator_tag>::__copy_m(_Tp*, _Tp*, _Up*) [with _Tp = const char; _Up = char; bool _IsMove = false]’,
    inlined from ‘constexpr _OI std::__copy_move_a2(_II, _II, _OI) [with bool _IsMove = false; _II = const char*; _OI = char*]’ at /usr/include/c++/13/bits/stl_algobase.h:506:30,
    inlined from ‘constexpr _OI std::__copy_move_a1(_II, _II, _OI) [with bool _IsMove = false; _II = const char*; _OI = char*]’ at /usr/include/c++/13/bits/stl_algobase.h:533:42,
    inlined from ‘constexpr _OI std::__copy_move_a(_II, _II, _OI) [with bool _IsMove = false; _II = __gnu_cxx::__normal_iterator<const char*, vector<char> >; _OI = char*]’ at /usr/include/c++/13/bits/stl_algobase.h:540:31,
    inlined from ‘constexpr _OI std::copy(_II, _II, _OI) [with _II = __gnu_cxx::__normal_iterator<const char*, vector<char> >; _OI = char*]’ at /usr/include/c++/13/bits/stl_algobase.h:633:7,
    inlined from ‘static _ForwardIterator std::__uninitialized_copy<true>::__uninit_copy(_InputIterator, _InputIterator, _ForwardIterator) [with _InputIterator = __gnu_cxx::__normal_iterator<const char*, std::vector<char> >; _ForwardIterator = char*]’ at /usr/include/c++/13/bits/stl_uninitialized.h:147:27,
    inlined from ‘_ForwardIterator std::uninitialized_copy(_InputIterator, _InputIterator, _ForwardIterator) [with _InputIterator = __gnu_cxx::__normal_iterator<const char*, vector<char> >; _ForwardIterator = char*]’ at /usr/include/c++/13/bits/stl_uninitialized.h:185:15,
    inlined from ‘constexpr _ForwardIterator std::__uninitialized_copy_a(_InputIterator, _InputIterator, _ForwardIterator, allocator<_Tp>&) [with _InputIterator = __gnu_cxx::__normal_iterator<const char*, vector<char> >; _ForwardIterator = char*; _Tp = char]’ at /usr/include/c++/13/bits/stl_uninitialized.h:373:37,
    inlined from ‘constexpr std::vector<_Tp, _Alloc>::vector(const std::vector<_Tp, _Alloc>&) [with _Tp = char; _Alloc = std::allocator<char>]’ at /usr/include/c++/13/bits/stl_vector.h:603:31,
    inlined from ‘constexpr csv::CSVFormat::CSVFormat(const csv::CSVFormat&)’ at /some_path/build/_deps/csv-src/single_include/csv.hpp:4917:11,
    inlined from ‘csv::CSVGuessResult csv::internals::_guess_format(csv::string_view, const std::vector<char>&)’ at /some_path/build/_deps/csv-src/single_include/csv.hpp:7417:46:
/usr/include/c++/13/bits/stl_algobase.h:437:30: error: ‘void* __builtin_memcpy(void*, const void*, long unsigned int)’ forming offset 1 is out of the bounds [0, 1] [-Werror=array-bounds=]
  437 |             __builtin_memmove(__result, __first, sizeof(_Tp) * _Num);
      |             ~~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
cc1plus: all warnings being treated as errors

I do not get any warnings with the -O2 flag.

I can compile if I add -Wno-array-bounds and -Wno-stringop-overread to my compile options. The code appears to work, but I wonder whether this is a safe thing to do?

Maybe there is something wrong in my code that leads to the problem:

LookupTable::LookupTable(const std::string& filename) {
  csv::CSVReader reader = openCsv(filename);
  for (auto &row : reader) {
    data.push_back(std::move(row));
  }
}

csv::CSVReader LookupTable::openCsv(const std::string& filename) {
  csv::CSVFormat format;

  // First, try to open the file using the given filename.
  if (std::filesystem::exists(filename)) {
    // determine delimiter from file
    auto delimiter = csv::guess_format(filename).delim;
    (delimiter == '\t') ? format.trim({' '}) : format.trim({' ', '\t'});
    csv::CSVReader reader = csv::CSVReader(filename, format);
    return reader;
  }

  // If the file is not found with the given filename, try each folder in the list.
  for (const std::string& folder : folders) {
    std::filesystem::path fullPath = std::filesystem::path(folder) / filename;
    if (std::filesystem::exists(fullPath)) {
      // determine delimiter from file
      auto delimiter = csv::guess_format(fullPath.string()).delim;
      (delimiter == '\t') ? format.trim({' '}) : format.trim({' ', '\t'});
      csv::CSVReader reader = csv::CSVReader(fullPath.string(), format);
      return reader;
    }
  }

  // If the file is not found in any of the folders, throw an exception or handle the situation as needed.
  throw std::runtime_error("CSV file not found.");
}

These are the relevant parts from my cmake configuration:

# CSV Parser
FetchContent_Declare(
        csv
        GIT_REPOSITORY https://github.com/vincentlaucsb/csv-parser.git
        GIT_TAG 2.2.3 # same issue with 2.1.3
)
  set(CMAKE_CXX_STANDARD 23) # same issue with 20
  target_compile_options(${PROJECT_NAME}
    PRIVATE
    -O3
    -Werror
    -pedantic-errors
    -Wpedantic
    -Wall
    -Wextra
    -Wconversion
    -Wsign-conversion
    -fmax-errors=1 # Remove
    -Wno-error=unused-parameter # Remove
    -Wno-error=unused-variable # Remove
# Need to add below lines to suppress warnings for CSV Parser (not necessary for -O2)
    -Wno-array-bounds 
    -Wno-stringop-overread
  )

Might be the same issue as #233

rajgoel commented 3 months ago

Sorry, didn't check the closed issues. This looks like a duplicate of #231.

rajgoel commented 3 months ago

I just looked at the code at https://github.com/vincentlaucsb/csv-parser/blob/ade1e985bdf95ed209e8592a4c61689231e0f4ed/single_include/csv.hpp#L7417 that causes the issue.

If I change

CSV_INLINE GuessScore calculate_score(csv::string_view head, CSVFormat format);

to

CSV_INLINE GuessScore calculate_score(csv::string_view head, const CSVFormat& format);

the problem is solved for me. Any reason to not use const CSVFormat& format?