Implement support for sparse feature data

For instance, all the data could be passed as a scipy.sparse.csc_matrix (e.g. after one-hot encoding).
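To make the input case concrete, here is a minimal sketch of what one-hot encoded data looks like as a CSC matrix and how a single feature column can be read without densifying. The variable names and the toy category codes are assumptions for illustration; nothing here is pygbm API.

```python
import numpy as np
from scipy import sparse

# Hypothetical integer-coded categorical feature with 3 categories.
categories = np.array([2, 0, 1, 2, 0])
n_samples = categories.shape[0]
n_categories = 3

# One-hot encode: exactly one nonzero per row, in the column of its category.
data = np.ones(n_samples, dtype=np.float32)
rows = np.arange(n_samples)
X = sparse.csc_matrix((data, (rows, categories)),
                      shape=(n_samples, n_categories))

# CSC stores each column contiguously, so the nonzero entries of
# feature j are a slice of X.indices / X.data delimited by X.indptr.
j = 2
start, end = X.indptr[j], X.indptr[j + 1]
nonzero_rows = X.indices[start:end]
nonzero_values = X.data[start:end]
```

Column-wise access of this kind is the reason CSC is a natural layout for per-feature work such as binning or histogram building.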

Pandas has support for sparse features: http://pandas.pydata.org/pandas-docs/stable/sparse.html

In particular, it has a dedicated data structure for 1D sparse data: SparseArray.
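As a rough illustration of that 1D structure, the sketch below builds a SparseArray and inspects what it actually stores. It assumes a recent pandas release where the class is exposed as pd.arrays.SparseArray; the sample values are made up.

```python
import numpy as np
import pandas as pd

values = np.array([0.0, 0.0, 1.5, 0.0, 3.0, 0.0])

# Only entries different from fill_value are physically stored.
arr = pd.arrays.SparseArray(values, fill_value=0.0)

stored = arr.sp_values         # the explicitly stored (non-fill) values
density = arr.density          # fraction of entries that are stored
dense_again = np.asarray(arr)  # materialize back to a dense ndarray
```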

There is also https://github.com/pydata/sparse, and I believe the ecosystem will converge at some point. I would be in favor of leveraging the data structure from Pandas to start with, since it is the most widely adopted solution that allows for heterogeneously typed features (a mix of dense and sparse columns, categorical or numerical).
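A possible shape for such heterogeneous input is sketched below: a single DataFrame holding a dense numerical column, a sparse numerical column and a categorical column, with a helper that dispatches on each column's dtype. The helper name describe_columns and the column names are hypothetical, and this is only one way the dispatch could be done.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "age": np.array([23.0, 41.0, 35.0, 52.0]),                  # dense numerical
    "clicks": pd.arrays.SparseArray([0, 0, 7, 0], fill_value=0),  # sparse numerical
    "city": pd.Categorical(["paris", "lyon", "paris", "nice"]),  # categorical
})


def describe_columns(frame):
    """Map each column name to 'sparse', 'categorical' or 'dense'."""
    kinds = {}
    for name in frame.columns:
        dtype = frame[name].dtype
        if isinstance(dtype, pd.SparseDtype):
            kinds[name] = "sparse"
        elif isinstance(dtype, pd.CategoricalDtype):
            kinds[name] = "categorical"
        else:
            kinds[name] = "dense"
    return kinds


column_kinds = describe_columns(df)
```

With this kind of per-column dispatch, each feature could be handled by the code path that suits its storage, without converting the whole dataset to one representation.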

ogrisel / pygbm

Implement support for sparse feature data #26