hosseinmoein / DataFrame

C++ DataFrame for statistical, Financial, and ML analysis -- in modern C++ using native types and contiguous memory storage
https://hosseinmoein.github.io/DataFrame/
BSD 3-Clause "New" or "Revised" License
2.41k stars, 306 forks

append_row followed by a visitor gives an unexpected result #241

Closed by kkonghao 1 year ago

kkonghao commented 1 year ago

Hi, when I use append_row to append a row to an existing DataFrame and then run the mean visitor, I get an unexpected result. Some other visitors do not give the expected result either.

static void test_append_row_with_visitor()
{
    std::cout << "\nTesting append_row() with MeanVisitor{ } ..." << std::endl;

    StlVecType<unsigned long>  idx =
    { 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20,
      21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37,
      38, 39, 40 };
    StlVecType<double> d1 =
    { 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20,
      21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37,
      38, 39, 40 };
    MyDataFrame         df;

    df.load_data(std::move(idx), std::make_pair("col_1", d1));

    MeanVisitor<double>  mean_visitor;

    const auto result = df.visit<double>("col_1", mean_visitor).get_result();
    std::cout << result << "\n";  // 20.5, as expected

    unsigned long   index_val = 41;

    df.append_row(&index_val, std::make_pair("col_1", 100));

    MeanVisitor<double>  mean_visitor1;

    const auto result1 = df.visit<double>("col_1", mean_visitor1).get_result();

    std::cout << result1 << "\n";  // also 20.5, unexpected
}

hosseinmoein commented 1 year ago

That is very interesting. DataFrame is silently failing. What is happening is that the literal 100 is read as an int, so std::make_pair gives you a pair of a string and an int. DataFrame then looks for a column "col_1" of type int, does not find it, and fails silently, so the row is never added to the double column. If you change the line to

    df.append_row(&index_val, std::make_pair("col_1", 100.0));

it should work.
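
The type deduction behind this can be checked with the standard library alone. The sketch below does not call DataFrame and makes no claim about its internals. It only shows which value type std::make_pair deduces for each literal. value_matches_column and check_examples are hypothetical helpers written for this illustration. C++17 is assumed.

```cpp
#include <cassert>
#include <type_traits>
#include <utility>

// Hypothetical helper (not part of DataFrame): true when the value half of a
// (name, value) pair has exactly the element type of the target column.
template<typename ColumnT, typename PairT>
constexpr bool value_matches_column(const PairT &)
{
    return std::is_same_v<typename PairT::second_type, ColumnT>;
}

// An integer literal makes the pair's value type int.
static_assert(std::is_same_v<
    decltype(std::make_pair("col_1", 100))::second_type, int>);

// A floating-point literal makes the pair's value type double.
static_assert(std::is_same_v<
    decltype(std::make_pair("col_1", 100.0))::second_type, double>);

// Runtime version of the same checks against a double column.
inline void check_examples()
{
    // int value: does not match a double column
    assert(! value_matches_column<double>(std::make_pair("col_1", 100)));

    // double value: matches
    assert(value_matches_column<double>(std::make_pair("col_1", 100.0)));

    // An explicit cast also yields a double value type
    assert(value_matches_column<double>(
               std::make_pair("col_1", static_cast<double>(100))));
}
```

When the value comes from an integer variable rather than a literal, an explicit static_cast<double> at the call site gives the same double value type as writing 100.0.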

kkonghao commented 1 year ago

Thanks for your help, I get it.