Closed: mikoontz closed this issue 2 years ago
Hi Michael, Thank you for being so interested in the package and for your thoughtful and well-developed comment. I truly appreciate it! I think you are right. On failure, the_feature_engineer() should return a list with named objects, as you suggest. I will fix that in the development version ASAP.
Do you think it'd be alright if the output $data slot carried the original data if the function cannot find any meaningful interactions? I think that'd facilitate running automated workflows, but I'd be happy to hear what you think about it.
Cheers, Blas
I updated the function in the development branch of the repo. Please try it when you can, and let me know if it works as you'd like!
I'll give it a try when I can! And to your question, my naive thinking would be to expect the $data slot to just have a copy of the original data if there are no new interactions to add, so I think your approach sounds like the right one!
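For anyone following along, here is a rough sketch of the return shape agreed on above: always a list, with $data carrying the original data when no interactions are found. The wrapper name as_consistent_result() and the toy data are made up for illustration; this is not the package's code, and it assumes only the two list elements named in this thread.

```r
# Hypothetical wrapper (not part of the package): turn an NA result into a
# list with the same two elements discussed in this thread.
as_consistent_result <- function(result, data, predictor.variable.names) {
  # If a list came back, interactions were found; pass it through untouched.
  if (is.list(result)) {
    return(result)
  }
  # Otherwise fall back to the original data and predictor names.
  list(
    data = data,
    predictor.variable.names = predictor.variable.names
  )
}

# Toy inputs, invented for this example.
original <- data.frame(y = c(1, 2), a = c(3, 4))

# NA stands in for the "no promising interactions" return value.
out <- as_consistent_result(NA, original, "a")

# Downstream code can read both elements without checking the type first.
model_data <- out$data
model_predictors <- out$predictor.variable.names
```

With a shape like this, an automated workflow can run the same two assignment lines whether or not interactions were found.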
I'm really enjoying using this package! Thank you so much for writing it. I hope it's okay to chime in about a few specific details/feature requests that other users might also find useful as I'm learning to use it.
One thing I've come across as I create a workflow is that the the_feature_engineer() function appears to return different data types depending on whether promising interactions are found. If promising interactions are found, a list is returned. If no promising interactions are found, NA is returned.
For my use case, anyway, it would smooth out the workflow if the returned data type were always a list with some of the named list elements being NULL if they are not applicable, but others getting filled in if possible. Particularly the $data and $predictor.variable.names list elements.
The tutorial (which is great) currently uses the following code block to "update" the data and predictor variable names which will be passed to the actual call to build the random forest model:
But these lines won't work if the_feature_engineer() has returned NA, implying no promising interactions.
I suppose a user could always use an if(is.na(feature_engineer_returned_results)) to either update the data/predictor variable names or not. Maybe that's better and/or your intention? What do you think?
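A minimal sketch of that user-side guard, assuming the behavior described above (a list on success, NA otherwise). The variable result and the toy data are stand-ins invented for this example. It tests is.list() rather than is.na(), because is.na() on a multi-element list returns one value per element, and recent R versions do not accept that as a single if() condition.

```r
# Toy inputs, invented for this example.
data <- data.frame(y = c(1, 2, 3), a = c(4, 5, 6), b = c(7, 8, 9))
predictor.variable.names <- c("a", "b")

# Stand-in for what the_feature_engineer() returned; NA mimics the
# "no promising interactions" case described in this issue.
result <- NA

# Only overwrite the data and predictor names when a list came back.
if (is.list(result)) {
  data <- result$data
  predictor.variable.names <- result$predictor.variable.names
}
```

When result is NA the guard is skipped, so data and predictor.variable.names keep their original values and the model call can proceed unchanged.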