p-ranav / csv2

Fast CSV parser and writer for Modern C++
MIT License

Empty line ends on segfault #32

Open jdreo opened 2 years ago

jdreo commented 2 years ago

If the CSV file has an empty line, parsing ends in a segfault. At the very least, a trailing newline should either be skipped silently or give rise to an explicit error rather than a segmentation fault.

Minimal example (can be copy/pasted into test/main.cpp):

TEST_CASE("Parse a SCSV string with column headers and trailing newline, using iterator-based loop" *
        test_suite("Reader")) {

  Reader<delimiter<' '>, quote_character<'"'>, first_row_is_header<true>> csv;
  const std::string buffer = "a b\nd 2 3\ne 5 6.7\n";

  csv.parse(buffer);

  const std::vector<std::string> expected_row_names{"d", "e"};
  const std::vector<double> expected_cell_values{2, 3, 5, 6.7};

  size_t rows=0, cells=0;
  for (auto row : csv) {
    auto icell = std::begin(row);
    std::string rname;
    (*icell).read_value(rname); // FIXME an operator-> would be expected to exist.
    REQUIRE(rname == expected_row_names[rows]);
    rows++;

    ++icell; // FIXME a postfix operator++ would be expected.
    for (; icell != std::end(row); ++icell) {
      std::string str;
      (*icell).read_raw_value(str);
      const double value = std::atof(str.c_str());
      REQUIRE(value == expected_cell_values[cells]);
      cells++;
    }
  }
  size_t cols = cells / rows;
  REQUIRE(rows == 2);
  REQUIRE(cols == 2);
}

Note that the code above demonstrates a use case that is not documented: parsing a table that has row headers. In that case, iterator-based loops make sense. However, the CellIterator interface, while functional, lacks two members one would expect: operator-> and a postfix operator++.
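To make the request concrete, here is a minimal sketch of the iterator surface being asked for. It is not csv2's implementation: `DemoCell`, `DemoCellIterator` and `DemoRow` are hypothetical stand-ins built on a `std::vector`, used only to show `operator->`, a postfix `operator++` and `operator==` side by side.

```cpp
#include <cstddef>
#include <iterator>
#include <string>
#include <vector>

// Hypothetical stand-in for a parsed cell; NOT csv2's actual cell type.
struct DemoCell {
  std::string text;
  void read_value(std::string& out) const { out = text; }
};

// Hypothetical forward iterator over the cells of one row.
class DemoCellIterator {
public:
  using iterator_category = std::forward_iterator_tag;
  using value_type = DemoCell;
  using difference_type = std::ptrdiff_t;
  using pointer = const DemoCell*;
  using reference = const DemoCell&;

  explicit DemoCellIterator(const DemoCell* p = nullptr) : p_(p) {}

  reference operator*() const { return *p_; }
  // Allows it->read_value(x) instead of (*it).read_value(x).
  pointer operator->() const { return p_; }

  // Prefix increment: advance, then return the advanced iterator.
  DemoCellIterator& operator++() { ++p_; return *this; }
  // Postfix increment: advance, but return a copy of the previous position.
  DemoCellIterator operator++(int) {
    DemoCellIterator previous = *this;
    ++p_;
    return previous;
  }

  friend bool operator==(const DemoCellIterator& a, const DemoCellIterator& b) {
    return a.p_ == b.p_;
  }
  friend bool operator!=(const DemoCellIterator& a, const DemoCellIterator& b) {
    return !(a == b);
  }

private:
  const DemoCell* p_;
};

// Hypothetical row type exposing begin()/end(), so std::begin/std::end work.
struct DemoRow {
  std::vector<DemoCell> cells;
  DemoCellIterator begin() const { return DemoCellIterator(cells.data()); }
  DemoCellIterator end() const {
    return DemoCellIterator(cells.data() + cells.size());
  }
};
```

With such an interface, the first FIXME in the test above could read `icell->read_value(rname);`, and `icell++` would compile alongside `++icell`.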

jdreo commented 2 years ago

At the very least, the documentation should mention iterator-based loops and the fact that the first iterator should be tested against the end of the row before it is dereferenced. For that check, one would also expect CellIterator to provide an operator==.
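The guard described above might look like the following sketch. It deliberately uses standard containers instead of csv2 types: `DemoRow`, `TableSummary` and `read_table` are hypothetical names, and representing a blank line as a row with no cells is an assumption, not a statement about how csv2 reports it.

```cpp
#include <cstdlib>
#include <iterator>
#include <string>
#include <vector>

// Hypothetical stand-in: one row is a list of cell strings.
// Assumption: a blank line shows up as a row with no cells.
using DemoRow = std::vector<std::string>;

struct TableSummary {
  std::vector<std::string> row_names;  // first cell of each non-empty row
  std::vector<double> values;          // remaining cells, converted to double
};

// Reads a table whose first column holds row names, skipping empty rows.
TableSummary read_table(const std::vector<DemoRow>& rows) {
  TableSummary out;
  for (const DemoRow& row : rows) {
    auto icell = std::begin(row);
    // Test the first iterator against the end of the row BEFORE
    // dereferencing it; an empty row has begin == end.
    if (icell == std::end(row)) {
      continue;
    }
    out.row_names.push_back(*icell);
    for (++icell; icell != std::end(row); ++icell) {
      out.values.push_back(std::atof(icell->c_str()));
    }
  }
  return out;
}
```

Called with the rows `{"d", "2", "3"}`, `{"e", "5", "6.7"}` and an empty row, this collects two row names and four values and skips the empty row instead of dereferencing an end iterator.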