alansaid closed this issue 10 years ago
Why is issue 54 related to this? I remember there was an issue about csv, but I cannot find it now.
If we create a complete pipeline, the data model will need to be aware of the dataset it's based on, so it will affect the complete process. (Granted, we won't have to rewrite anything, provided that we don't remove the current methods in the data model.)
Ok, thanks. Thought it was a typo, but it makes sense.
Apache Commons CSV 1.0 has now been released. I'll start working on adding this to a data reader. http://mail-archives.apache.org/mod_mbox/commons-user/201408.mbox/%3CCAB917R%2BYoXZr-8zftqZ4hgYqFz6cgRXxqqza4gtaF7LdrKMJQw%40mail.gmail.com%3E
Apache Commons CSV 1.0 is now used in the parser in rival-core. Additional configurations (CSV/TSV/JSON) can be added in rival-examples (see rival-examples/RandomMahoutIBRecommenderEvaluator).
The data model should support richer data sets (e.g. TSV/CSV input files with more than 4 columns).
Probably a good idea to wait until https://commons.apache.org/proper/commons-csv/ reaches a stable release.
Related to #54
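To make the request above more concrete, here is a minimal sketch of what a row with more than 4 columns could look like once parsed. Everything in it is an assumption for illustration: the `RichRow` class, its field names, and the idea that the first four columns are user, item, rating and timestamp are hypothetical and not taken from rival. It splits lines naively with the JDK only, so it does not handle quoted fields; a real reader would more likely delegate that to Apache Commons CSV.

```java
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Pattern;

// Hypothetical sketch: none of these names come from rival.
final class RichRow {
    final String user;
    final String item;
    final double rating;
    final long timestamp;
    // Columns beyond the first four, keyed by header name, kept in file order.
    final Map<String, String> extras;

    RichRow(String user, String item, double rating, long timestamp,
            Map<String, String> extras) {
        this.user = user;
        this.item = item;
        this.rating = rating;
        this.timestamp = timestamp;
        this.extras = Collections.unmodifiableMap(extras);
    }

    // Parses one delimited line against a header row.
    // Assumption: the first four columns are user, item, rating, timestamp.
    static RichRow parse(String[] header, String line, String delimiter) {
        // Limit -1 keeps trailing empty fields; quote() treats the delimiter literally.
        String[] fields = line.split(Pattern.quote(delimiter), -1);
        if (header.length < 4 || fields.length != header.length) {
            throw new IllegalArgumentException(
                "Expected " + header.length + " fields (at least 4), got " + fields.length);
        }
        Map<String, String> extras = new LinkedHashMap<>();
        for (int i = 4; i < fields.length; i++) {
            extras.put(header[i], fields[i]);
        }
        return new RichRow(fields[0], fields[1],
                Double.parseDouble(fields[2]), Long.parseLong(fields[3]), extras);
    }

    public static void main(String[] args) {
        String[] header = {"user", "item", "rating", "timestamp", "genre", "year"};
        RichRow row = parse(header, "u1\ti9\t4.5\t1400000000\tcomedy\t1999", "\t");
        System.out.println(row.user + " rated " + row.item + ", extras: " + row.extras);
    }
}
```

Keeping the extra columns in a map means the existing four-column accessors would not have to change, which fits the point made earlier in the thread about not removing current methods.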