Open RahulDubey391 opened 7 months ago
@RahulDubey391 Thank you for your efforts. The main thing missing is the implementation of `inverse_transform` using snowpark dataframes.
Our codebase has a system for autogenerating scikit-learn wrapper classes that allow users to execute fits/transforms/etc with snowpark dataframes or pandas dataframes. For `inverse_transform` we would like to implement it in this way, and add it to wrapper classes if the underlying scikit-learn base estimator has an `inverse_transform` method. Take a look at the `codegen/` directory for the templates and autogen logic. The autogenerated code is built via bazel.
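The idea of adding `inverse_transform` only when the base estimator has one can be sketched roughly as below. This is a minimal illustration and not the actual codegen templates: `ToyScaler`, `ToyBinarizer`, and `make_wrapper` are hypothetical stand-ins, and plain Python lists take the place of snowpark or pandas dataframes.

```python
class ToyScaler:
    """Hypothetical stand-in for an estimator that supports inverse_transform."""

    def __init__(self, factor=2.0):
        self.factor = factor

    def transform(self, values):
        return [v * self.factor for v in values]

    def inverse_transform(self, values):
        return [v / self.factor for v in values]


class ToyBinarizer:
    """Hypothetical stand-in for an estimator with no inverse_transform."""

    def transform(self, values):
        return [1 if v > 0 else 0 for v in values]


def make_wrapper(estimator_cls):
    """Build a wrapper class; inverse_transform is added only if the base has it."""

    def __init__(self, *args, **kwargs):
        self._estimator = estimator_cls(*args, **kwargs)

    def transform(self, values):
        return self._estimator.transform(values)

    namespace = {"__init__": __init__, "transform": transform}

    # Mirror the rule from the comment above: generate the method conditionally.
    if hasattr(estimator_cls, "inverse_transform"):

        def inverse_transform(self, values):
            return self._estimator.inverse_transform(values)

        namespace["inverse_transform"] = inverse_transform

    return type(f"Wrapped{estimator_cls.__name__}", (object,), namespace)


WrappedScaler = make_wrapper(ToyScaler)
WrappedBinarizer = make_wrapper(ToyBinarizer)
```

Here `WrappedScaler` exposes `inverse_transform` while `WrappedBinarizer` does not. A real template would presumably make the same check at generation time rather than at runtime.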
We realize this could be daunting, but we'd like to support you if you are still interested in contributing. We could greatly improve the contribution guide in `CONTRIBUTING.md`.
The other thing about `inverse_transform` is that when using snowpark dataframes for transformations, the input columns are retained by default. The user may want to discard them and can set `drop_input_cols=True`. But by default, since the input columns are retained, there is limited utility for `inverse_transform`.
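To make the point about retained input columns concrete, here is a small sketch. It is an assumption-laden illustration, not the library's implementation: `scale_columns` is a hypothetical helper, a dict of lists stands in for a dataframe, and only the `drop_input_cols` flag name comes from the discussion above.

```python
def scale_columns(frame, input_cols, output_cols, factor=2.0, drop_input_cols=False):
    """Hypothetical transform: write scaled copies of input_cols into output_cols."""
    result = dict(frame)  # shallow copy so the caller's frame is not modified
    for src, dst in zip(input_cols, output_cols):
        result[dst] = [v * factor for v in frame[src]]
    if drop_input_cols:
        # Discard the originals, unless an output column reuses the same name.
        for src in input_cols:
            if src not in output_cols:
                result.pop(src, None)
    return result


frame = {"PRICE": [1.0, 2.5]}

# Default: the input column is kept next to the transformed one.
kept = scale_columns(frame, ["PRICE"], ["PRICE_SCALED"])

# With drop_input_cols=True only the transformed column remains.
dropped = scale_columns(frame, ["PRICE"], ["PRICE_SCALED"], drop_input_cols=True)
```

In the default case the original values are still available in `PRICE`, so there is little to recover. Once the inputs are dropped, an inverse transformation is the only way back to them.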
Hope this makes sense, let us know if you have questions.
Thanks a lot @sfc-gh-thoyt for the guidance. Yes, I am still interested in contributing and would like to go ahead and explore codegen. I'll go through it and raise any questions that come up.
Hi, I have added the inverse_transform() method to the BaseTransformer class. My understanding is that the method from the scikit-learn library is overloaded with a different signature.
I need some help confirming whether this assumption is correct. I also see that all the other preprocessing classes need to have this method added.