Addresses https://github.com/picnicml/doddle-model/issues/105 and removes a third-party dependency. A ~1M x 513 matrix now loads in a few minutes (previously I couldn't even measure the loading time). The downside is that rows are parsed with a plain `row.split(",")`, so string values that contain a `,` (a comma that doesn't separate columns) can't be parsed correctly. I'm happy to accept this limitation for the performance benefit; we can improve it later if needed.
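To make the comma limitation concrete, here is a minimal sketch of what plain splitting does to a quoted value. `SplitLimitation` and `naiveSplit` are hypothetical names used only for illustration, not the loader code in this PR; the only call taken from the change itself is `row.split(",")`.

```scala
object SplitLimitation {
  // Hypothetical helper: mirrors the plain split described in this PR.
  def naiveSplit(row: String): Array[String] = row.split(",")

  def main(args: Array[String]): Unit = {
    // Three purely numerical columns: every comma is a separator.
    val plain = "1.0,2.0,3.5"
    // Three intended columns, but the quoted middle value contains a comma.
    val quoted = "1.0,\"Smith, John\",3.5"

    println(naiveSplit(plain).length)  // 3 fields, as intended
    println(naiveSplit(quoted).length) // 4 fields: the quoted value is cut in two
  }
}
```

Datasets whose string values never contain a comma are unaffected, which is the trade-off accepted here.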
Loading a dataset with only numerical features should be faster than loading one with categorical features.