Open pat-s opened 4 years ago
The number of grid points is implicitly `min(grid.size, length(unique(quantiles)))`, as you described.
I think this behavior should be fine: when many values are clustered at certain points, you simply need fewer intervals. But I guess it would make sense to add this to the docs.
As for `type = 1`, I am not entirely sure why I set it like this.
Why is the length of the resulting DF per feature so different when setting `grid.size = 99`? I was not able to relate the setting to the actual outcome differences by reading `?FeatureEffect`.

Created on 2020-01-15 by the reprex package (v0.3.0.9001)
Edit: The following code sets the grid
https://github.com/christophM/iml/blob/54b2ce26d8d13f9a6fcd635ee00c8d4835b2cad3/R/FeatureEffect-ale.R#L17-L17
and in more detail this one
https://github.com/christophM/iml/blob/54b2ce26d8d13f9a6fcd635ee00c8d4835b2cad3/R/utils.R#L191-L192
So essentially `quantile(type = 1)` is called with `probs` being a `seq()` whose `length.out` is set by `grid.size`. I wonder if this could make it into the argument description in the help page? Maybe one could also include the motivation for `type = 1`.

The differing outcomes shown above are then caused by
https://github.com/christophM/iml/blob/54b2ce26d8d13f9a6fcd635ee00c8d4835b2cad3/R/FeatureEffect-ale.R#L16-L17
which removes duplicated values from the `quantile()` output.

Regarding interpretation: Does the differing number of unique values for these features introduce a bias when interpreting the ALE plots for the specific predictors? Or is it like "20 is fine, everything greater is better, but there is no bias when comparing the ALE plots of these features"?
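
For readers who want to see the effect in isolation, the grid construction described above can be sketched in base R. This is a minimal illustration, not the iml source: `make_grid` is a hypothetical helper, the data are simulated, and `length.out = grid.size` follows the description in this issue (the linked code may apply an offset).

```r
# Hypothetical helper mimicking the behavior described in this issue:
# quantiles with type = 1 over an evenly spaced probs sequence,
# followed by removal of duplicated grid values.
make_grid <- function(x, grid.size) {
  probs <- seq(0, 1, length.out = grid.size)  # assumption: no offset applied
  q <- quantile(x, probs = probs, type = 1, names = FALSE)
  unique(q)
}

set.seed(1)
continuous <- rnorm(500)                                       # essentially no ties
clustered  <- rep(c(0, 1, 2, 5), times = c(300, 100, 60, 40))  # only 4 distinct values

g_continuous <- make_grid(continuous, grid.size = 99)
g_clustered  <- make_grid(clustered, grid.size = 99)

# Same grid.size, very different grid lengths
print(length(g_continuous))
print(length(g_clustered))

# With type = 1 every grid point is an observed data value
print(all(g_clustered %in% clustered))
```

The feature with few distinct values ends up with a much shorter grid even though `grid.size` is identical, which matches the differing data frame lengths reported above. One documented property of `type = 1` is that it uses the inverse of the empirical distribution function, so it returns observed values rather than interpolated ones; whether that was the motivation for choosing it is not stated in this thread.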