stevenpawley / recipeselectors

Additional recipes for supervised feature selection to be used with the tidymodels recipes package
https://stevenpawley.github.io/recipeselectors/
Other
55 stars 7 forks source link

Possible new step: FCBF #5

Open rowanjh opened 2 years ago

rowanjh commented 2 years ago

Hi! I have been using the fast correlation-based filter algorithm in one of my projects, and made a package to implement this as a recipe step. It uses the bioconductor FCBF package as an engine, and brings it into the step_fcbf() function. I am new to package development so I'm sure there are things I can still tidy up, but it seemed like it would fit well with the recipeselectors package. Is this of interest?

stevenpawley commented 2 years ago

Hello, sorry for taking a while to get back to you. This sounds like it would be it nice addition to recipeselectors and I'm happy to help to work it into the package. First, however, I'm trying to figure out what the best approach is to handle BioConductor dependencies, particularly if we want to submit to CRAN at some point. There is some discussion here https://github.com/ropensci/dev_guide/issues/210, I just haven't had chance to digest it properly yet...

rowanjh commented 2 years ago

Great! In that case perhaps I'll tidy things up a bit in the package and see if there is anything I can do to make it more consistent with the other functions in recipeselectors.

Yes I was wondering about the Bioconductor dependency... I'll also look into whether I can change the engine to a CRAN-based package instead e.g. it looks like the package 'Biocomb' is an option, with Biocomb::select.fast.filter.

If I can sort that out, then I suppose the next thing is to submit a pull request and we go from there?

stevenpawley commented 1 year ago

Hello! Sorry for the very long hiatus but I finally had a little time and I took the liberty of taking your step_fcbf and including it within recipeselectors/colino. Hope that is ok and I really like your idea of dealing with the 'outcome' argument, which I'm going to extend across all of the other steps. I've also quoted the FCBF::fscb function so that the package does not depend on a bio conductor package. Tests appear to pass but if this is ok, would you mind taking a look and checking that all is working as expected?

stevenpawley commented 1 year ago

I should say that maybe additional discussion should be continued at https://github.com/stevenpawley/colino which is the new package name