Closed morrissharp closed 2 years ago
I'm now using a fixed seed in case1.ipynb, case2.ipynb, and case3.ipynb.
Regarding the possibility of passing parameters to the model: the aim of the train_model_fetch_results() and train_model_plot_results() functions is to simplify the training and testing process. The goal is to use the same model architecture with the same parameters before and after a pre-processing step, in order to test the effectiveness of the pre-processing. Since the objective is not to find the best parameters for a given model, I only allow the user to specify different model architectures (xgboost, knn, etc.), all with default parameters.
https://github.com/microsoft/responsible-ai-toolbox-mitigations/blob/0d69bb6db4ddf92db1870a265147e7458be0cf5f/notebooks/dataprocessing/case_study/case2.ipynb?short_path=c0a0c33#L611
The case2.ipynb notebook references the ability to set a seed, but this is not available for split_data(), train_model_plot_results(), or train_model_fetch_results(). Additionally, I have noticed that there is no way to pass any parameters to the model itself for instantiation/fitting (e.g. setting the number of neighbors for KNN). I am not sure whether you expect these functions to be used outside of the example notebooks, but if so, you should consider allowing the user to set a random seed, as well as pass in model parameters, possibly using something like *args/**kwargs.
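A hedged sketch of what the issue is asking for, assuming scikit-learn-style estimators; this is not the library's actual signature, and the MODELS registry is a hypothetical stand-in for the architecture lookup the real function performs:

```python
# Sketch: expose a random seed and forward model parameters via **kwargs,
# as the issue suggests for train_model_fetch_results().
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.neighbors import KNeighborsClassifier

# Hypothetical architecture registry (the real function supports more).
MODELS = {"knn": KNeighborsClassifier, "forest": RandomForestClassifier}

def train_model_fetch_results(X, y, model="knn", random_state=None, **model_params):
    cls = MODELS[model]
    # Only estimators that accept random_state get the seed; others
    # (e.g. KNN, which is deterministic) are left untouched.
    if "random_state" in cls().get_params():
        model_params.setdefault("random_state", random_state)
    # Forward user-supplied parameters (e.g. n_neighbors for KNN)
    # straight to the estimator's constructor.
    estimator = cls(**model_params)
    estimator.fit(X, y)
    return estimator

# Example: set the number of neighbors explicitly.
X, y = make_classification(random_state=0)
clf = train_model_fetch_results(X, y, model="knn", n_neighbors=3)
print(clf.n_neighbors)  # 3
```

Using keyword arguments rather than a bare *args keeps the call sites readable and lets the estimator's own constructor validate the parameters.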