hosseinmoein / DataFrame

C++ DataFrame for statistical, Financial, and ML analysis -- in modern C++ using native types and contiguous memory storage
https://hosseinmoein.github.io/DataFrame/
BSD 3-Clause "New" or "Revised" License
2.41k stars, 306 forks

Sort failing giving Segmentation Fault #224

Closed: sierret closed this issue 1 year ago

sierret commented 1 year ago

I was testing some of the functions to get used to the library. Everything else seems fine, but when I attempt to sort I get a segmentation fault.

```cpp
using ULDataFrame = StdDataFrame<unsigned long>;
ULDataFrame df;
df.create_column<std::string>(static_cast<const char *>("Name"));
df.create_column<int>(static_cast<const char *>("age"));

unsigned long   index_val = 0;

df.append_row(&index_val,
              std::make_pair("Name", std::string("Poo")),
              std::make_pair("age", 300));
df.write<std::ostream, std::string, double, int>(std::cout, io_format::csv2);

unsigned long val2=index_val+1;
df.append_row(&val2,
              std::make_pair("Name", std::string("Poole")),
              std::make_pair("age", 600));
df.write<std::ostream, std::string, double, int>(std::cout, io_format::csv2);

val2+=1;
df.append_row(&val2,
              std::make_pair("Name", std::string("Pooley")),
              std::make_pair("age", 900));
df.write<std::ostream, std::string, double, int>(std::cout, io_format::csv2);

string col="age";
df.sort<std::string, std::string>(col.c_str(),sort_spec::desce);
```

hosseinmoein commented 1 year ago

This should probably fail more gracefully, but you are specifying the template types for sort incorrectly. For sort, first specify the type(s) of the column(s) used for sorting, then all the column types in the DataFrame. So your sort line should be:

 df.sort<int, std::string, int>(col.c_str(),sort_spec::desce);
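
The rule above (key column types first, then every column type in the DataFrame) can be pictured with a toy model. The sketch below is plain standard C++17 and is not DataFrame's implementation: `ToyFrame`, `holds`, `permute_if`, and `sort_desc` are hypothetical names. It assumes columns are stored type-erased, which is why a sort that has to reorder every column needs the full type list, and it shows one way a mismatched key type could be reported as an exception rather than a crash.

```cpp
#include <algorithm>
#include <any>
#include <cassert>
#include <cstddef>
#include <map>
#include <numeric>
#include <stdexcept>
#include <string>
#include <typeinfo>
#include <utility>
#include <vector>

// Toy stand-in for a heterogeneous frame: each column is a std::vector<T>
// hidden behind std::any. This is an assumption for illustration only,
// not DataFrame's real storage.
using ToyFrame = std::map<std::string, std::any>;

// True if the type-erased column holds std::vector<T>.
template <typename T>
bool holds(const std::any &col) {
    return col.type() == typeid(std::vector<T>);
}

// Reorder one column if it holds std::vector<T>; report whether it did.
template <typename T>
bool permute_if(std::any &col, const std::vector<std::size_t> &order) {
    auto *vec = std::any_cast<std::vector<T>>(&col);
    if (vec == nullptr) return false;  // column has some other type
    assert(vec->size() == order.size());
    std::vector<T> sorted;
    sorted.reserve(vec->size());
    for (std::size_t i : order) sorted.push_back((*vec)[i]);
    *vec = std::move(sorted);
    return true;
}

// Key is the type of the sort column; Types... must cover every column type.
template <typename Key, typename... Types>
void sort_desc(ToyFrame &frame, const std::string &key_name) {
    // The pointer form of any_cast returns nullptr on a type mismatch, so a
    // wrong Key becomes an exception instead of undefined behavior.
    auto *key = std::any_cast<std::vector<Key>>(&frame.at(key_name));
    if (key == nullptr)
        throw std::invalid_argument("key type does not match column " + key_name);

    // Validate every column before touching any of them.
    for (const auto &[name, col] : frame)
        if (!(holds<Types>(col) || ...))
            throw std::invalid_argument("no listed type matches column " + name);

    // Compute the descending row order from the key column.
    std::vector<std::size_t> order(key->size());
    std::iota(order.begin(), order.end(), std::size_t{0});
    std::stable_sort(order.begin(), order.end(),
                     [key](std::size_t a, std::size_t b) {
                         return (*key)[a] > (*key)[b];
                     });

    // Apply the same row order to every column, trying each listed type.
    for (auto &entry : frame)
        (void)(permute_if<Types>(entry.second, order) || ...);
}
```

With the thread's two columns stored in a `ToyFrame`, `sort_desc<int, std::string, int>(frame, "age")` reorders both columns, while `sort_desc<std::string, std::string>(frame, "age")` throws `std::invalid_argument` in this toy model.
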
sierret commented 1 year ago

Oh I understand. It works now thanks.

I may need to open a new issue for this as well, but you might consider adding an absolute-value sort, i.e., sorting in ascending or descending order based on the absolute value of a column. It should be trivial to do this with the already implemented sort functions by performing an ascending and a descending sort on two DataFrames and then joining them by row based on the column's absolute value.
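
To make the request concrete, here is a minimal sketch of an absolute-value sort in plain standard C++. It does not use the DataFrame API and it swaps in a different technique from the one suggested above: a single index sort with a custom comparator instead of two sorts and a join. The names `abs_sort_order` and `reorder` are hypothetical, and the key column is assumed to hold `double` values with no NaNs.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <numeric>
#include <string>
#include <vector>

// Row order that sorts `key` by absolute value, ascending or descending.
// Ties keep their original relative order (stable sort).
std::vector<std::size_t> abs_sort_order(const std::vector<double> &key,
                                        bool descending) {
    std::vector<std::size_t> order(key.size());
    std::iota(order.begin(), order.end(), std::size_t{0});
    std::stable_sort(order.begin(), order.end(),
                     [&key, descending](std::size_t a, std::size_t b) {
                         const double x = std::abs(key[a]);
                         const double y = std::abs(key[b]);
                         return descending ? x > y : x < y;
                     });
    return order;
}

// Apply a row order to one column so that all columns stay aligned.
template <typename T>
std::vector<T> reorder(const std::vector<T> &col,
                       const std::vector<std::size_t> &order) {
    assert(col.size() == order.size());
    std::vector<T> out;
    out.reserve(col.size());
    for (std::size_t i : order) out.push_back(col[i]);
    return out;
}
```

For a key column `{-3, 1, -2, 5}`, the ascending order is rows 1, 2, 0, 3; passing that same order to `reorder` for every column keeps the rows aligned.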

hosseinmoein commented 1 year ago

I will put it on my to-do list. But yes, please open a new issue.