Closed mikeDTI closed 3 years ago
💀☕️
On Thu, Aug 6, 2020 at 5:42 PM Mary B. Makarious notifications@github.com wrote:
Assigned #10 https://github.com/GenoML/genoml_v2/issues/10 to @mikeDTI https://github.com/mikeDTI.
--
Mike A. Nalls, PhD
Data Tecnica International http://www.datatecnica.com/ Note: I check emails only in bursts ... for immediate project specific issues please use the relevant BaseCamp.
Please make sure that this is a feature request.
System information:
Describe Current Behavior/State and Recommended Feature Request: Right now feature selection is part of munge, using an old extraTrees routine.
It would be good to add multiple feature selection options at train as well, to cut down on information leakage between the training and withheld samples at algorithm selection. This may have some slight ramifications at tune, since all data is in the CV routine there.
Will this change the current API? How? Extra options for feature selection at train, and a new feature to extract the list of selected features for use at tune.
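To make the leakage point concrete, here is a minimal sketch of selecting features on the training split only and then applying the same list to the withheld samples. It is not GenoML code: the function names (`score_features`, `select_top_k`, `apply_selection`, `make_toy_data`), the toy data, and the simple class-mean-difference score are all assumptions made for illustration, standing in for whichever selector (extraTrees or otherwise) ends up being offered at train.

```python
import random
from statistics import mean


def score_features(X, y):
    """Hypothetical univariate score: absolute difference in class means."""
    scores = []
    for j in range(len(X[0])):
        cases = [row[j] for row, label in zip(X, y) if label == 1]
        controls = [row[j] for row, label in zip(X, y) if label == 0]
        scores.append(abs(mean(cases) - mean(controls)))
    return scores


def select_top_k(X_train, y_train, k):
    """Rank features using the training split only; return the kept column indices."""
    scores = score_features(X_train, y_train)
    ranked = sorted(range(len(scores)), key=lambda j: scores[j], reverse=True)
    return sorted(ranked[:k])


def apply_selection(X, keep):
    """Subset any split to a previously chosen feature list."""
    return [[row[j] for j in keep] for row in X]


def make_toy_data(n_samples=200, n_features=20, seed=0):
    """Toy data (an assumption): only feature 0 carries signal."""
    rng = random.Random(seed)
    y = [i % 2 for i in range(n_samples)]
    X = []
    for label in y:
        row = [rng.gauss(0, 1) for _ in range(n_features)]
        row[0] += 2.0 * label
        X.append(row)
    return X, y


X, y = make_toy_data()
split = len(X) * 7 // 10  # 70% train, 30% withheld
X_train, y_train = X[:split], y[:split]
X_test, y_test = X[split:], y[split:]

# Selection sees only the training samples, so the withheld samples
# cannot influence which features are kept.
selected_features = select_top_k(X_train, y_train, k=5)
X_train_sel = apply_selection(X_train, selected_features)
X_test_sel = apply_selection(X_test, selected_features)
```

The `selected_features` list is the kind of artifact that could be extracted at train and handed to tune, so that the CV routine there works from a fixed feature set instead of reselecting on all of the data.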
Who Will Benefit from this Feature? Everyone without external datasets / resources for feature triaging.
Any Additional Information? Faraz just started watching Twin Peaks and I am proud of him.