Section 8.5 has:
"... we permuted the feature’s values, which breaks the relationship between the feature and the true outcome."
However, I think that, in rare cases, a permutation could end up with a per-class distribution of the feature's values similar to that of the original data, which would hide the performance loss. Would a more robust method be to calculate the mean of the feature and set all samples to that value?
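To make the comparison concrete, here is a minimal sketch of both ideas on toy data. Everything in it is an assumption for illustration, not code from the book: the synthetic data, the fixed threshold `model`, and the helper names `accuracy`, `permutation_drops`, and `mean_replacement_drop` are all hypothetical. It repeats the permutation several times so the spread of the accuracy drops is visible, and computes the single drop obtained by setting the feature to its mean.

```python
import random
import statistics


def accuracy(model, X, y):
    """Fraction of rows for which the model predicts the true label."""
    return sum(model(row) == label for row, label in zip(X, y)) / len(y)


def permutation_drops(model, X, y, j, n_repeats=20, seed=0):
    """Accuracy drop after shuffling column j, once per repeat."""
    rng = random.Random(seed)
    base = accuracy(model, X, y)
    drops = []
    for _ in range(n_repeats):
        col = [row[j] for row in X]
        rng.shuffle(col)  # breaks the link between feature j and y
        X_perm = [row[:j] + [v] + row[j + 1:] for row, v in zip(X, col)]
        drops.append(base - accuracy(model, X_perm, y))
    return drops


def mean_replacement_drop(model, X, y, j):
    """Accuracy drop after setting column j to its mean for every row."""
    mean_j = statistics.fmean(row[j] for row in X)
    X_mean = [row[:j] + [mean_j] + row[j + 1:] for row in X]
    return accuracy(model, X, y) - accuracy(model, X_mean, y)


# Toy data (assumption): feature 0 depends on the class, feature 1 is noise.
data_rng = random.Random(42)
y = [data_rng.randint(0, 1) for _ in range(200)]
X = [[data_rng.gauss(2.0 * label, 1.0), data_rng.gauss(0.0, 1.0)] for label in y]


def model(row):
    """Hypothetical fixed classifier that only looks at feature 0."""
    return int(row[0] > 1.0)


perm_drops = permutation_drops(model, X, y, j=0)
mean_drop = mean_replacement_drop(model, X, y, j=0)

print("permutation: mean drop", statistics.fmean(perm_drops),
      "min", min(perm_drops), "max", max(perm_drops))
print("mean replacement: drop", mean_drop)
```

The minimum over the repeated permutations gives a rough sense of how much an unlucky shuffle could hide the loss on this toy data, while the mean-replacement variant yields one deterministic number for a given dataset. Note that the two variants probe the model differently: a shuffled column keeps the feature's marginal distribution, whereas a constant column does not.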