Ekeany / Boruta-Shap

A tree-based feature selection tool that combines the Boruta feature selection algorithm with Shapley values.
MIT License
559 stars 86 forks

fixed find_sample risk of infinite loop #78

Closed adalseno closed 1 year ago

adalseno commented 2 years ago

What does this PR do?

It looks like the original find_sample does not increment iteration, so there is a risk of an endless loop. I have added the line iteration += 1.

References

Testing performed

No testing performed

Known issues

Apparently no issue has been raised yet, but it may happen
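
To illustrate the endless-loop risk described above, here is a minimal, self-contained sketch of a retry loop of this kind. It is not the library's code: the function signature, the `is_similar` callback (standing in for the KS-test check), and the `tries_per_size` threshold are assumptions made for illustration only. The point is that the sample size can only grow if the attempt counter is actually incremented.

```python
import random


def find_sample(population, sizes, is_similar, tries_per_size=20, seed=0):
    """Draw random samples until is_similar(sample) is true.

    Hypothetical stand-in for the method discussed in this PR:
    `sizes` lists the candidate sample sizes in increasing order and
    `is_similar` replaces the KS-test comparison.
    """
    rng = random.Random(seed)
    element = 0    # index of the current sample size
    iteration = 0  # failed attempts at the current size
    while element < len(sizes):
        sample = rng.sample(population, sizes[element])
        if is_similar(sample):
            return sample
        # Without this increment the condition below is never true,
        # so the sample size never grows and the loop may never end.
        iteration += 1
        if iteration == tries_per_size:
            element += 1
            iteration = 0
    raise RuntimeError("no acceptable sample found at any size")


population = list(range(100))
# Accept only samples with at least 10 elements, so the size-5 attempts all fail.
sample = find_sample(population, sizes=[5, 10, 15], is_similar=lambda s: len(s) >= 10)
print(len(sample))  # 10
```

The sketch also bounds the loop by the number of available sizes and raises an error when they are exhausted. That is a design choice of the sketch, not something the PR claims to add.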

adalseno commented 2 years ago

Hi, I just realised there is another issue:

def find_sample(self):
    '''
    Finds a sample by comparing the distributions of the anomaly scores between the sample and the original
    distribution using the KS-test. Starts off at 5%, however it will increase to 10% and then 15% etc. if a significant sample cannot be found
    '''
    loop = True
    iteration = 0
    size = self.get_5_percent_splits(self.X.shape[0])
    element = 1

To start with 5%, we should set element = 0; otherwise the first sample taken is 10%. Shall I open a new pull request, or will you fix it manually?
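
The off-by-one can be seen in a small sketch. The helper below is hypothetical: it assumes that the splits are an increasing list of multiples of 5% of the row count, which is consistent with the comment above but is not shown in the quoted excerpt.

```python
def five_percent_splits(length):
    """Hypothetical helper: sample sizes at 5%, 10%, 15%, ... of `length`."""
    step = round(0.05 * length)
    return list(range(step, length, step))


splits = five_percent_splits(1000)
print(splits[0])  # 50, i.e. 5% of the rows (element = 0)
print(splits[1])  # 100, i.e. 10% of the rows (element = 1)
```

Under that assumption, starting with element = 1 skips the 5% sample entirely.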