Open ischoegl opened 1 year ago
Regarding parsing of CSV files in C++, here are some preliminary findings: `<regex>` mostly uses ECMAScript, which does not support conditional matching.
I feel like there has to be something in Boost to do this... Implementing a csv parser from scratch seems overkill 🤔
Wouldn't be too hard if this regex were supported by C++'s `<regex>`. It may, however, be supported by `<boost/regex.hpp>` ... I lost most of my appetite after spending more time than seemed necessary trying to figure out how to translate the conditional to ECMAScript.
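For anyone who wants to check this locally, here is a minimal sketch that asks whether `std::regex` accepts a given pattern. The helper name `stdRegexAccepts` is made up for illustration, and the sketch assumes the standard library reports an unsupported construct by throwing `std::regex_error` from the constructor.

```cpp
#include <regex>
#include <string>

// Hypothetical helper (not part of any library): returns true if std::regex,
// using its default ECMAScript grammar, can compile the given pattern.
bool stdRegexAccepts(const std::string& pattern)
{
    try {
        std::regex rgx(pattern);  // compilation happens in the constructor
        (void) rgx;               // silence unused-variable warnings
        return true;
    } catch (const std::regex_error&) {
        return false;  // the grammar rejected the pattern
    }
}
```

Calling it with a plain pattern such as `"(?:^|,)[^,]*(?=,|$)"` should succeed, while a pattern containing a conditional group such as `(?(1)...|...)` should be rejected under the ECMAScript grammar.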
Not sure it would even resolve the problem, but I wanted to add a word of caution. `boost/regex.hpp` is a compiled part of Boost, which is something we've been avoiding a dependency on, due to some of the complications involved in linking to those.
Too bad. I just confirmed that `<boost/regex.hpp>` would indeed resolve the problem 😢
PS: this is how to get the header line after opening the file ...
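```cpp
// Requires <string>, <vector>, and <boost/regex.hpp>;
// `file` is assumed to be an already opened std::ifstream.
std::string line;
std::getline(file, line);
// Group 1 records whether a field opens with a quote;
// group 2 captures the field contents.
boost::regex rgx(
    "(?:^|,)(?=[^\"]|(\")?)\"?((?(1)[^\"]*|[^,\"]*))\"?(?=,|$)");
std::vector<std::string> labels;
auto line_begin = boost::sregex_iterator(line.begin(), line.end(), rgx);
auto line_end = boost::sregex_iterator();
for (boost::sregex_iterator item = line_begin; item != line_end; ++item) {
    boost::smatch match = *item;
    labels.push_back(match.str(2));
}
```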
The syntax would be the same for `<regex>`, but the capturing string doesn't work.
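As a point of comparison, a hand-written splitter that needs neither `<regex>` nor a compiled Boost library is fairly short. The sketch below is only an illustration: `splitCsvLine` is a hypothetical name, not an existing Cantera function, and it assumes comma separators, double-quote quoting with `""` as an escaped quote, and records that do not span multiple lines.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper (not part of Cantera): split one CSV line into fields.
// Assumes ',' as separator, '"' as quote character, and no embedded newlines.
std::vector<std::string> splitCsvLine(const std::string& line)
{
    std::vector<std::string> fields;
    std::string field;
    bool inQuotes = false;
    for (std::size_t i = 0; i < line.size(); ++i) {
        char c = line[i];
        if (inQuotes) {
            if (c == '"') {
                if (i + 1 < line.size() && line[i + 1] == '"') {
                    field += '"';  // doubled quote inside quotes: literal quote
                    ++i;
                } else {
                    inQuotes = false;  // closing quote
                }
            } else {
                field += c;  // commas inside quotes belong to the field
            }
        } else if (c == '"') {
            inQuotes = true;  // opening quote
        } else if (c == ',') {
            fields.push_back(field);  // separator ends the current field
            field.clear();
        } else {
            field += c;
        }
    }
    fields.push_back(field);  // last field (possibly empty)
    return fields;
}
```

After `std::getline(file, line)`, the header labels would then be `splitCsvLine(line)`. The sketch does not strip a trailing `\r` from files with Windows line endings or trim whitespace, so a real implementation would need to decide how to treat those cases.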
We could vendor this single file, header-only CSV reader: https://github.com/ben-strasser/fast-cpp-csv-parser, or something similar.
Abstract
Recent work added HDF support to `Sim1D::save/restore` (Cantera/cantera#1385) and implemented `SolutionArray::save/restore` for HDF and YAML (Cantera/cantera#1426). On the back-end, `SolutionArray` handles file IO in both cases. As those methods are implemented in the C++ layer, they are portable across all APIs.

Adding CSV support to `SolutionArray::save/restore` in C++ to replace Python's `SolutionArray.write_csv/read_csv` is a logical extension. It can build on the existing infrastructure and would handle CSV support consistently, replacing the historically grown patchwork of dissimilar approaches used at the moment. One additional benefit would be to resolve Cantera/cantera#1372.

Motivation
Describe the need for the proposed change:
Possible Solutions
Create versions of `SolutionArray::readEntry/writeEntry` that handle CSV. While writing is straightforward, reading CSV will need the implementation of a suitable parser. Per https://github.com/Cantera/cantera/issues/1372#issuecomment-1370177622 by @speth

- `SolutionArray::writeEntry` implementation is proposed in Cantera/cantera#1508 ... update: now merged
- `SolutionArray::readEntry` implementation is missing

References