Open teju85 opened 3 years ago
After discussions with @vinaydes, it appears that, unfortunately, our current approach of using no temporary memory for uniform feature sampling does NOT work with weighted sampling! :(
It can be made to work, but the computational cost would be too high, since we would be trading extra compute for zero memory.
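For context, one standard way to do weighted sampling without replacement is the Efraimidis-Spirakis key method, which needs O(n) temporary keys (or an O(k) heap), illustrating the memory/compute trade-off mentioned above. A minimal Python sketch (not cuML's actual implementation):

```python
import heapq
import random

def weighted_sample_without_replacement(weights, k, rng=random.Random(0)):
    # Efraimidis-Spirakis: draw key_i = u_i ** (1 / w_i) for each feature i
    # with weight w_i > 0, then keep the k largest keys. Correctness relies
    # on materializing (or streaming through a heap) one key per feature,
    # i.e. it is not a zero-temporary-memory scheme.
    keyed = ((rng.random() ** (1.0 / w), i) for i, w in enumerate(weights) if w > 0)
    return [i for _, i in heapq.nlargest(k, keyed)]

# Example: sample 2 of 4 features with non-uniform weights.
features = weighted_sample_without_replacement([0.1, 0.5, 0.2, 0.2], 2)
```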
This issue has been labeled `inactive-30d` due to no recent activity in the past 30 days. Please close this issue if no further response or action is needed. Otherwise, please respond with a comment indicating any updates or changes to the original issue and/or confirm this issue still needs to be addressed. This issue will be labeled `inactive-90d` if there is no activity in the next 60 days.
**Is your feature request related to a problem? Please describe.**
The current RF implementation only supports uniform subsampling of features (as of 0.18). We need to extend this to also support weighted subsampling in RF.

**Describe the solution you'd like**
Ideally, we should expose a `feature_weights` option in the constructor for both the classifier and regressor. Its default value is `None` (i.e. uniform subsampling). If it is not `None`, it must be a list of weights, one for each feature in the dataset. Then, when `max_features` is less than 1 (meaning subsampling is enabled), we perform either uniform or weighted subsampling, respectively.

**Additional context**
JFYI, sklearn does NOT support such an option.