IIIS-Li-Group / OpenFE

OpenFE: automated feature generation with expert-level performance
MIT License

Stage 2 doesn't actually select the features! #56

Closed: wangxk15 closed this issue 1 week ago

wangxk15 commented 1 week ago

After checking the code, I found a discrepancy between the paper's description and the code with respect to the stage II selection step.

The paper says that "we only select the top-ranked candidate features to improve the generalization of our algorithm."

But in the code, the function stage2_select() just sorts ofe.candidate_features_list by the feature importance given by the lgbm model, so the length of new_features is the same as the length of ofe.candidate_features_list. Nothing is actually dropped.

Is there anything I'm missing, or did the authors do this on purpose, so that we can set a threshold in stage II to do the selection ourselves?
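
To make the distinction concrete, here is a minimal sketch of what I mean. This is not the OpenFE code: the function names rank_features and select_top_k, the toy feature names, and the importance scores are all hypothetical, and I am only assuming that each candidate comes with one importance score.

```python
from typing import List, Tuple

# Hypothetical representation: (feature name, importance score).
Scored = Tuple[str, float]


def rank_features(scored: List[Scored]) -> List[Scored]:
    """Sort candidates by importance, descending. Every candidate is kept."""
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


def select_top_k(scored: List[Scored], k: int) -> List[Scored]:
    """Sort candidates, then keep only the k highest-ranked ones."""
    return rank_features(scored)[:k]


# Toy importance scores, made up for illustration only.
candidates = [("a+b", 0.10), ("a*b", 0.45), ("log(a)", 0.00), ("a/b", 0.30)]

ranked = rank_features(candidates)        # same length as the input
selected = select_top_k(candidates, k=2)  # truncated to the top 2

print(len(candidates), len(ranked), len(selected))
```

The first function is what the current behavior looks like to me (ordering only), while the second is what I understood from the paper (ordering plus a cutoff). With ordering only, a caller could still apply the cutoff by slicing the returned list.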

Thanks.