Hi Derek,
Thank you for providing the reproducing examples. I think the issue here is that GridSearchCV, as implemented by sklearn, is meant for single inputs. I realize that, other than the rather obscure comment in SKLearnWrapper quoted below, it is not obvious that you cannot pass multiple inputs/outputs when using SKLearnWrapper + GridSearchCV. I'll improve the docs to make this more obvious.
class SKLearnWrapper:
    """Wrapper utility class that allows models to used in scikit-learn's
    ``GridSearchCV`` API. It follows the style of Keras' own wrapper.
    A future release of **baikal** plans to remove this class and instead
    include a custom ``GridSearchCV`` API, based on the original scikit-learn
    implementation, that can handle baikal models natively.
    """
In the meantime, I think you can work around it by merging the multiple inputs (and multiple outputs, if any) before feeding them into the model, then splitting them inside the model with Split and Stack-ing the outputs.
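To make the bookkeeping behind that workaround concrete, here is a minimal sketch in plain NumPy. The three inputs, their feature counts, and all variable names are made up for illustration. It only shows how inputs can be merged into one array and cut apart again at known column boundaries; inside a baikal model the cutting would be done by a step rather than by hand, and that part is not shown here.

```python
import numpy as np

rng = np.random.default_rng(0)
n_samples = 426  # same sample count as in the reported error

# Three hypothetical inputs with different numbers of features.
X1 = rng.normal(size=(n_samples, 4))
X2 = rng.normal(size=(n_samples, 3))
X3 = rng.normal(size=(n_samples, 2))

# Merge column-wise so that a single 2D array can be handed to the search.
X_merged = np.hstack([X1, X2, X3])

# Column indices at which the merged array has to be cut to undo the merge.
split_points = np.cumsum([X1.shape[1], X2.shape[1]])

# Splitting along axis=1 at those indices recovers the original inputs.
X1_back, X2_back, X3_back = np.split(X_merged, split_points, axis=1)
```

The merged array has one row per sample, so it should pass the sample-count check that a list of arrays fails.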
Thank you for the quick response! I will give the workaround a try as that sounds like a simple/great solution!
Closing due to inactivity. Feel free to reopen if you need further help.
I am attempting to switch a working model based on readme_long_example to use a GridSearchCV fit, but when I call fit, the gscv does not appear to like my multiple inputs and raises a new error (before, I was able to fit and predict with my model):
For example: gscv.fit([X1_train, X2_train, X3_train], Y1_train, **fit_params)
How to reproduce it? I have modified an example to show that this behavior occurs in readme_long_example as well. It also gives an error that suggests the input shape is related:
ValueError: Found input variables with inconsistent numbers of samples: [2, 426]
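A likely reading of the numbers in that message, sketched below: a Python list of arrays has no shape attribute, so its "number of samples" is taken to be its length (the number of inputs), while the target reports its real row count. The num_samples helper is a simplified stand-in written for this illustration, not scikit-learn's actual function, and the array shapes are assumed.

```python
import numpy as np


def num_samples(x):
    """Simplified stand-in: rows for arrays, len() for anything else."""
    return x.shape[0] if hasattr(x, "shape") else len(x)


# Hypothetical data: two inputs and one target, 426 samples each.
X1_train = np.zeros((426, 5))
X2_train = np.zeros((426, 3))
Y1_train = np.zeros(426)

# The list of two inputs counts as 2 "samples"; the target counts as 426.
counts = [num_samples([X1_train, X2_train]), num_samples(Y1_train)]
print(counts)  # [2, 426]
```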
The runnable, modified example can be found here: https://gist.github.com/DMTSource/2b38b473270a50e71025dd6cb1c03521
What versions are you using? baikal==0.4.2, scikit-learn==0.24.1, Python 3.7.6 (anaconda env)