stephenberry / glaze

Extremely fast, in memory, JSON and interface library for modern C++
MIT License

Request an option of disabling column names of csv output and writing in rowwise structure #1305

Open YanzhaoW opened 2 months ago

YanzhaoW commented 2 months ago

Hi,

I would like to request two features regarding csv writing.

First, it would be really nice if users could disable printing of the column names. In some situations the full data set is not available up front, and new values of the structure are read and written inside an event loop. Each time the structure is converted to a string, however, the output contains the column names again, whereas a correct CSV file has the column names only once, at the top.

For example:

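#include <glaze/glaze.hpp>
#include <print>
#include <sstream>
#include <string>
#include <vector>

// Each member is one CSV column; the member name becomes the column name.
struct CsvStruct {
    std::vector<int> header1 {1, 2, 3};
    std::vector<float> header2 {4., 5, 6};
    std::vector<std::string> header3 {"a", "b", "c"};
};

auto main() -> int {
    auto my_csv = CsvStruct{};
    auto sstream = std::stringstream{};
    auto buffer = std::string{};
    // First write: emits the column names followed by the rows.
    auto ec =
        glz::write<glz::opts{.format = glz::csv, .layout=glz::colwise}>(
            my_csv, buffer);
    sstream << buffer;
    // Second write (e.g. the next event-loop iteration): the column names
    // are emitted again.
    ec =
        glz::write<glz::opts{.format = glz::csv, .layout=glz::colwise}>(
            my_csv, buffer);
    sstream << buffer;
    std::print("{}", sstream.str());
    return 0;
}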

outputs a string:

header1,header2,header3
1,4,a
2,5,b
3,6,c
header1,header2,header3
1,4,a
2,5,b
3,6,c

which is not a well-formed CSV file, because the column names appear a second time in the middle of the data.
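Until such an option exists, one possible workaround is to drop the first line of every chunk after the first one before appending it to the output. The sketch below is plain C++ and does not use any Glaze API; `strip_first_line` and `append_csv_chunk` are hypothetical helper names, and it assumes that each serialized chunk starts with exactly one header line and ends with a newline.

```cpp
#include <string>
#include <string_view>

// Hypothetical helper (not part of Glaze): returns the text after the first
// newline. If there is no newline, the input holds only a header line, so the
// result is empty.
inline std::string_view strip_first_line(std::string_view csv) {
    const auto pos = csv.find('\n');
    if (pos == std::string_view::npos) {
        return {};
    }
    return csv.substr(pos + 1);
}

// Appends one serialized CSV chunk to `out`. The header line is kept only for
// the first chunk; for later chunks it is stripped before appending.
inline void append_csv_chunk(std::string& out, std::string_view chunk,
                             bool& header_written) {
    if (header_written) {
        out.append(strip_first_line(chunk));
    } else {
        out.append(chunk);
        header_written = true;
    }
}
```

In the event loop from the example, `buffer` would be passed to `append_csv_chunk` after each write instead of being streamed directly. This costs one extra scan for the first newline per chunk, which is why a built-in option would still be preferable.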

The second request is the ability to write a CSV string from a vector of structs. In most cases each row of a CSV file represents one data point, so it is very common to have something like std::vector<DataPoint>. It would be great to have an API like:

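#include <glaze/glaze.hpp>
#include <string>
#include <vector>

// One instance is one CSV row; the member names are the column names.
struct CsvStruct{
    int header1 = 1;
    float header2 = 2.;
    std::string header3 = "a";
};

auto main() -> int {
    auto my_csv = std::vector<CsvStruct>{};
    my_csv.emplace_back();
    auto buffer = std::string{};
    // Requested usage: write a vector of structs row by row.
    auto ec =
        glz::write<glz::opts{.format = glz::csv, .layout=glz::rowwise}>(
            my_csv, buffer);
    return 0;
}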
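To make the intended output concrete, here is a minimal hand-written sketch of what such a call might produce for a vector of structs. It does not use Glaze; `DataPoint` and `to_csv_rows` are hypothetical names, the `with_header` flag stands in for the first request, and fields are written without any quoting or escaping.

```cpp
#include <sstream>
#include <string>
#include <vector>

// Hypothetical data point mirroring the struct in the example above.
struct DataPoint {
    int header1 = 1;
    float header2 = 2.f;
    std::string header3 = "a";
};

// Hand-written illustration, not Glaze API: writes one CSV line per element.
// The column names are written once at the top, and only if requested.
// Fields are emitted as-is, so values containing commas, quotes or newlines
// are not handled.
inline std::string to_csv_rows(const std::vector<DataPoint>& rows,
                               bool with_header) {
    std::ostringstream os;
    if (with_header) {
        os << "header1,header2,header3\n";
    }
    for (const auto& row : rows) {
        os << row.header1 << ',' << row.header2 << ',' << row.header3 << '\n';
    }
    return os.str();
}
```

Calling `to_csv_rows(points, true)` for the first batch and `to_csv_rows(points, false)` for later batches would yield a file with a single header line, which combines both requests.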

Many thanks in advance

stephenberry commented 2 months ago

Thanks for your suggestions. I've had an issue for a while about supporting CSVs without column or row keys (#853), so this is extra motivation to get that done.

Your example of std::vector<DataPoint> is also a good suggestion.

I'm not sure when I'll get to these, because I'm making other improvements to Glaze right now. But, I'll keep this issue alive until these features are added.