automl / Auto-PyTorch

Automatic architecture search and hyperparameter optimization for PyTorch
Apache License 2.0

Subsampling vs feature selection? #414

Open nabenabe0928 opened 2 years ago

nabenabe0928 commented 2 years ago

Since we usually assume the manifold hypothesis, i.e. that the data have low intrinsic dimensionality, I think it is better to reduce the dataset size by feature selection (reducing the number of columns) rather than by subsampling (reducing the number of rows).
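To make the two options concrete, here is a minimal sketch that shrinks the same toy dataset to the same number of cells in both ways. It is not Auto-PyTorch code: the helper names `subsample_rows` and `select_columns_by_variance` are hypothetical, and ranking features by variance is only a simple stand-in for whatever selection criterion would actually be used.

```python
import random
from statistics import pvariance


def subsample_rows(X, n_rows, seed=0):
    """Row reduction: keep a uniform random subset of n_rows samples."""
    rng = random.Random(seed)
    keep = sorted(rng.sample(range(len(X)), n_rows))
    return [X[i] for i in keep]


def select_columns_by_variance(X, n_cols):
    """Column reduction: keep the n_cols features with the largest variance.

    Variance ranking is an illustrative assumption, not a recommendation.
    """
    n_features = len(X[0])
    variances = [pvariance([row[j] for row in X]) for j in range(n_features)]
    ranked = sorted(range(n_features), key=lambda j: variances[j], reverse=True)
    keep = sorted(ranked[:n_cols])
    return [[row[j] for j in keep] for row in X], keep


# Toy data: 200 samples, 10 features, of which only features 0-2 vary much.
rng = random.Random(42)
X = [
    [rng.gauss(0, 1) if j < 3 else rng.gauss(0, 0.01) for j in range(10)]
    for _ in range(200)
]

# Both reductions leave 600 cells: 60 x 10 versus 200 x 3.
X_rows = subsample_rows(X, n_rows=60)
X_cols, kept = select_columns_by_variance(X, n_cols=3)
```

With the same memory budget, the column reduction keeps all 200 samples and drops only the near-constant features, while the row reduction discards 140 samples and still carries the uninformative columns.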

What do you think, @ravinkohli?

ravinkohli commented 2 years ago

For reference, AutoGluon has implemented feature pruning here