Ekrekr / ldr

LDR stands for Latent Dimensionality Reduction. It is a generic method for interpreting models. It is deployed here as a python module.
GNU General Public License v3.0
1 stars 0 forks source link

Binning data with equal intervals drastically reduces efficacy of prediction imputation. #1

Open Ekrekr opened 4 years ago

Ekrekr commented 4 years ago

When imputing features this dramatically reduces accuracy, as if all the samples are in a single bin of the linearly spaced intervals, then it contributes nothing to the prediction.

When visualizing features, this makes only a small portion of the graph actually be colored.

In order to fix the issue with the graph:

Neither of these options work for space like 'o'. In this case a transformation would be required to make it easy to view on a graph, such as a Fourier tranform. I am not a fan of this as it would stretch the data. Further investigation is required.

Ekrekr commented 4 years ago

On top of this feature prediction is currently only implemented for missing columns, rather than for variables missing on each row.