Within my selected features output by BorutaShap I have 4 features that are selected, but according to the plot (shown above) these 4 are less important than the maximum shadow feature. How do I interpret why these features were selected? Originally I thought that the maximum shadow feature serves as the cut-off for selecting important features.
Hi,
Sorry for the late response; this is a great question. You are completely correct that the maximum shadow feature serves as the cutoff in the feature selection process.
The problem is that this graph can be misleading at times. In the cutoff process we compare the features on a per-round basis, so in one round a feature's importance might be larger than the max shadow feature's and in the next round it could be smaller. If a feature consistently beats the random shadow feature it is accepted (green); if not, it is rejected (red); and if we are unsure it is marked tentative (blue).
In your case the random shadow feature has a larger average feature importance than some of the accepted features, which looks odd. However, each of those features consistently beat the max shadow feature's value on a per-round basis, which is why they were accepted.
You can see in the box plot that the random shadow feature has quite a large variance compared to the 4 accepted features, so you can picture the rounds in which those 4 features would have beaten it.
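To make the per-round logic concrete, here is a minimal sketch of a Boruta-style decision rule (illustrative only, not BorutaShap's actual internals; the function and variable names are made up):

```python
import numpy as np
from scipy.stats import binomtest

def boruta_style_decision(feature_importances, max_shadow_importances, p_value=0.05):
    """Decide a feature's fate from its per-round importances.

    feature_importances: importance of one feature in each round.
    max_shadow_importances: the max shadow importance in each round.
    """
    rounds = len(feature_importances)
    # A "hit" is a round in which the feature beats the max shadow feature.
    hits = int(np.sum(np.asarray(feature_importances) >
                      np.asarray(max_shadow_importances)))
    # Accepted: the feature beats the max shadow significantly more often than chance.
    if binomtest(hits, rounds, 0.5, alternative="greater").pvalue < p_value:
        return "accepted"   # green in the plot
    # Rejected: it loses significantly more often than chance.
    if binomtest(hits, rounds, 0.5, alternative="less").pvalue < p_value:
        return "rejected"   # red in the plot
    return "tentative"      # blue in the plot
```

Because the decision is based on how often a feature wins rather than on its average importance, a feature with a modest mean can still beat a high-variance shadow feature in most rounds and be accepted.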
Hope this helps!
Thank you for your reply and sorry I'm also just seeing this now!
This completely answers my question and gives me a good understanding of the plot, thank you!
Would I be able to further filter the features selected by BorutaShap, with the aim of being even more stringent, by taking only those with a larger average feature importance than the average importance of the max shadow feature (so essentially dropping the 4 features behind it in the plot)? Or would this invalidate running BorutaShap in the first place?
Hi,
You can definitely apply your own filter to the feature set afterwards, as nothing compares to domain knowledge or human intuition.
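For example, a stricter average-importance filter could look like the sketch below, assuming you have exported the per-round importance values to a DataFrame; the `Max_Shadow` column name and the helper itself are hypothetical, not part of the BorutaShap API:

```python
import pandas as pd

def stricter_subset(importances: pd.DataFrame, accepted: list) -> list:
    """Keep only accepted features whose average importance beats the
    average max shadow importance (a post-hoc, stricter cutoff)."""
    means = importances.mean()       # average importance per column
    cutoff = means["Max_Shadow"]     # hypothetical shadow column name
    return [f for f in accepted if means[f] > cutoff]
```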
There are also some hyperparameters in the model for this reason. Instead of calling the default hyperparameters you could change them to be more strict about which features are accepted: `BorutaShap(percentile=100, pvalue=0.05)`.

- `percentile` (int): an integer ranging from 0 to 100 that changes the value of the max shadow importance. Lowering it makes the algorithm more lenient.
- `pvalue` (float): a float used as the significance level. Increasing it makes the algorithm more lenient; decreasing it makes it more strict. Making the model stricter can also slow the runtime, as the algorithm becomes less likely to either reject or accept features.

The default `percentile` is already 100, so that won't help you, but you could decrease `pvalue` from its default of 0.05 to something lower, such as 0.01, to be stricter about which features are accepted.
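As a rough sketch (not a definitive recipe), a stricter run might look like the following; the `percentile`/`pvalue` keywords follow the description above, so double-check their spelling against your installed BorutaShap version, and `X` and `y` are assumed to be your existing feature matrix and target:

```python
from BorutaShap import BorutaShap

# Stricter settings: keep the max shadow feature as the threshold
# (percentile=100) and lower the significance level so the algorithm
# accepts fewer features.
selector = BorutaShap(importance_measure='shap',
                      classification=True,
                      percentile=100,
                      pvalue=0.01)
selector.fit(X=X, y=y, n_trials=100)
```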
Thank you so much this is incredibly helpful! I will look into using those hyperparameters going forward, thanks again!