AlexMili / torch-dataframe

Utility class to manipulate dataset from CSV file
MIT License

Column order #5

Closed gforge closed 8 years ago

gforge commented 8 years ago

As the network expects labels to appear in a certain order, it is important that the tensor follows the exact column structure of the CSV file. As I understand it, the iteration order of table keys in Lua is undefined, so it may be beneficial to add a column order that is derived from the CSV file.

Implementation details

By adding a self.column_order field that is populated by load_csv/load_table, we can make sure that the columns returned by to_tensor are always in the same order. We could use the csvigo fromcsv function as a source of inspiration. The head, tail, and show functions should of course respect the column order as well.
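A rough sketch of the idea, in plain Lua so it runs without Torch. It is not the actual torch-dataframe code: the `Dataframe.new` constructor, the `load_table(header, rows)` signature, and the `to_rows` method (a stand-in for `to_tensor` that returns nested tables instead of a tensor) are all assumptions made for illustration.

```lua
-- Hypothetical sketch, not the real torch-dataframe implementation.
local Dataframe = {}
Dataframe.__index = Dataframe

function Dataframe.new()
  -- dataset maps column name -> array of values (key order undefined in Lua);
  -- column_order is an array that remembers the order seen in the CSV header.
  return setmetatable({dataset = {}, column_order = {}}, Dataframe)
end

-- header: array of column names in CSV order (assumed input shape)
-- rows:   array of arrays, one inner array per CSV row
function Dataframe:load_table(header, rows)
  self.dataset = {}
  self.column_order = {}
  for i, name in ipairs(header) do
    self.column_order[i] = name
    self.dataset[name] = {}
    for r, row in ipairs(rows) do
      self.dataset[name][r] = row[i]
    end
  end
end

-- Stand-in for to_tensor: walks column_order with ipairs instead of
-- iterating self.dataset with pairs, so the column layout is deterministic.
function Dataframe:to_rows()
  local out = {}
  local first = self.column_order[1]
  local n_rows = first and #self.dataset[first] or 0
  for r = 1, n_rows do
    out[r] = {}
    for c, name in ipairs(self.column_order) do
      out[r][c] = self.dataset[name][r]
    end
  end
  return out
end
```

The same `ipairs(self.column_order)` loop could drive head, tail, and show, so that every view of the data agrees on the column layout.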

AlexMili commented 8 years ago

Yes, I see, but do you have any idea how to get the right order into the Dataframe class from the csvigo lib? Or do you just want to keep the order consistent between the Dataframe lib and the tensor file?

gforge commented 8 years ago

Good point, this could cause a lot of frustration. I guess it's best not to rely on csvigo's save() but to write our own. The code that csvigo uses is trivial.

AlexMili commented 8 years ago

Totally! We could also make a pull request to the main repo and use our custom version while waiting for the pull request to be accepted.

gforge commented 8 years ago

I think to_csv will have to rely on our internal column_order. Doing this through csvigo would require it to accept a column order argument, which seems like a long shot since it wouldn't be used anywhere else.
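For reference, a minimal sketch of what a writer of our own could look like, assuming the data is stored as a table mapping column names to arrays of values. The function name `to_csv_string` and its arguments are hypothetical, it does not call csvigo at all, and it does no quoting or escaping of separators inside values.

```lua
-- Hypothetical helper: serialises columns in the order given by column_order.
-- dataset:      table mapping column name -> array of values (assumed layout)
-- column_order: array of column names in the desired output order
-- separator:    field separator, defaults to ","
local function to_csv_string(dataset, column_order, separator)
  separator = separator or ","
  -- Header line follows column_order, not pairs(dataset).
  local lines = {table.concat(column_order, separator)}
  local first = column_order[1]
  local n_rows = first and #dataset[first] or 0
  for r = 1, n_rows do
    local fields = {}
    for c, name in ipairs(column_order) do
      fields[c] = tostring(dataset[name][r])
    end
    lines[#lines + 1] = table.concat(fields, separator)
  end
  return table.concat(lines, "\n") .. "\n"
end
```

Writing the returned string to disk with io.open would then give a file whose columns match the original CSV, regardless of how Lua happens to iterate the underlying table.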