stoltzmaniac opened this issue 4 years ago
Have you tried using default arguments in `__init__`? That seems the easiest option.
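For example, something like this (the class and parameter names here are made up for illustration):

```python
class PreProcessor:
    def __init__(self, data, seed=555, train_split=0.8):
        # seed and train_split fall back to their defaults when not passed
        self.data = data
        self.seed = seed
        self.train_split = train_split

p = PreProcessor(data=[1., 2., 3.])            # uses seed=555, train_split=0.8
q = PreProcessor(data=[1., 2., 3.], seed=123)  # overrides only the seed
```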
One option is to make the defaults class variables and then overwrite them in `__init__`, or in some other method.
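A sketch of that pattern (again, made-up names):

```python
class Model:
    # class-level defaults shared by every instance
    seed = 555
    train_split = 0.8

    def __init__(self, seed=None, train_split=None):
        # instance attributes shadow the class defaults only when provided
        if seed is not None:
            self.seed = seed
        if train_split is not None:
            self.train_split = train_split

m = Model()          # m.seed -> 555 (falls through to the class attribute)
n = Model(seed=123)  # n.seed -> 123 (instance attribute wins)
```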
Another option is to make a config class and pass it in to configure things; then there's a clear structure.
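One way to write that is with a dataclass holding the defaults (a sketch, with hypothetical names and values):

```python
from dataclasses import dataclass

@dataclass
class ModelConfig:
    seed: int = 555
    train_split: float = 0.8
    scale_type: str = "standardize"

class Model:
    def __init__(self, data, config=None):
        # fall back to an all-defaults config when none is supplied
        self.data = data
        self.config = config or ModelConfig()

m = Model(data=[1., 2., 3.], config=ModelConfig(seed=123))
# m.config.seed -> 123, m.config.train_split -> 0.8
```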
There's also multiple inheritance you can use... (Bwahahahaah!) This can get confusing because not many people use it much, but it can be an interesting option, especially if you want to consume several different sets of defaults; see the sketch below.
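Roughly, each mixin contributes its own defaults and forwards the rest along the MRO (names here are invented):

```python
class Base:
    def __init__(self, **kwargs):
        # terminal class: store whatever keyword arguments arrive
        self.params = kwargs

class SeedDefaults:
    def __init__(self, **kwargs):
        kwargs.setdefault("seed", 555)       # contribute a default seed
        super().__init__(**kwargs)

class ScaleDefaults:
    def __init__(self, **kwargs):
        kwargs.setdefault("scale_type", "normalize")  # contribute a default scale
        super().__init__(**kwargs)

class PCAModel(SeedDefaults, ScaleDefaults, Base):
    pass

p = PCAModel(train_split=0.7)
print(p.params)  # {'train_split': 0.7, 'seed': 555, 'scale_type': 'normalize'}
```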
The dictionary `update` method can also be useful: set the defaults in one dictionary and let the user override them with another passed-in dictionary, via `params.update(defaults); params.update(passed_in_dict)`.
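A minimal sketch of that pattern (defaults and names are placeholders):

```python
DEFAULTS = {"seed": 555, "train_split": 0.8, "scale_type": "normalize"}

class Model:
    def __init__(self, **passed_in):
        params = {}
        params.update(DEFAULTS)    # start from the defaults
        params.update(passed_in)   # user-supplied values win
        for name, value in params.items():
            setattr(self, name, value)

m = Model(seed=123)
# m.seed -> 123, m.train_split -> 0.8, m.scale_type -> 'normalize'
```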
Can you write down some examples of three different config cases you imagine for a particular function? There are lots of ways to skin the cat here.
That makes a lot of sense; I think I'll avoid the multiple inheritance for a few more seasons :)
Greg just sent over a PR to help me out with some of these issues. I really like the idea of using a dictionary, though; it seems to make a lot of sense. I'll create the examples soon.
It would be awesome to be able to write something like the following without specifying a seed, and have it default to, let's say, 555. Notice that `seed=123` is missing below, so as things stand it throws an error; I'd like it to fall back to a default instead:

```python
import numpy as np
# assuming PreProcessData is importable from the models package; adjust to match the repo
from models.data_handlers import PreProcessData

pre_process_data = PreProcessData(
    predictor_vars=np.array([[0., 1., 2.], [4., 5., 6.]]),
    response_var=np.array([1., 2.]),
    scale_type="normalize",
    train_split=0.8
)
```
Another case would be PCA. Since you almost always normalize variables there, it would be great to have `scale_type="normalize"` be the default in that case only. Notice that `scale_type='normalize'` is missing below, so it throws an error:

```python
from models.dimensionizers import PrincipalComponentAnalysis
from sklearn.datasets import load_iris

iris = load_iris()
data_x = iris['data']
data_y = iris['target']

pca = PrincipalComponentAnalysis(
    predictor_vars=data_x,
    response_var=data_y,
    train_split=0.7,
    seed=123,
    variance_explained_cutoff=0.95
)
```
The correct results for those are shown in the readme.
This keeps going on and on: `learning_rate` should have a default within `LinearRegression`, and so forth. One thing I just discussed with Greg is the trouble with `super().__init__`: the arguments are positional only, not keyword arguments, so I have to keep a consistent order every single time I inherit in the next class. That isn't a great experience, especially to debug.
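For illustration, something like this keyword-based pattern would avoid the ordering problem (the class names match the repo, but these signatures are made up, not the actual code):

```python
class BaseModel:
    def __init__(self, predictor_vars, response_var, train_split=0.8, seed=555):
        self.predictor_vars = predictor_vars
        self.response_var = response_var
        self.train_split = train_split
        self.seed = seed

class LinearRegression(BaseModel):
    def __init__(self, learning_rate=0.01, **kwargs):
        # everything not consumed here is forwarded to the parent by keyword,
        # so argument order no longer matters
        super().__init__(**kwargs)
        self.learning_rate = learning_rate

lr = LinearRegression(
    response_var=[1., 2.],            # keyword arguments can come in any order
    predictor_vars=[[0., 1.], [2., 3.]],
    learning_rate=0.05,
)
```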
https://github.com/stoltzmaniac/Basic-ML-OOP/tree/master/01-Regression/Linear-Regression/Part-02
I have the models folder written up, where `data_handlers.py` handles a bunch of the different inputs, and I have broken everything into classes. Then I have a separate `dimensionalizers.py`, which does PCA. I also created `regression.py` to handle regression. What I'm having trouble with is inheritance (OOP is not my strongest card). How can I improve this and utilize `super()` more efficiently? Currently I cannot assign default values; I want the user to be able to input any parameter at any point but have defaults set. I would also like not to force every single parameter to be input in the same order. I know this could be accomplished through `*args` and/or `**kwargs`, but I could really use a hand.
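Something like this sketch is roughly what I'm imagining, i.e. defaults everywhere plus order-independent parameters (the class names are from the repo, but the signatures here are assumptions, not the actual code):

```python
class PreProcessData:
    def __init__(self, predictor_vars, response_var,
                 scale_type="standardize", train_split=0.8, seed=555):
        # every parameter except the data itself has a default,
        # and all of them can be passed by keyword in any order
        self.predictor_vars = predictor_vars
        self.response_var = response_var
        self.scale_type = scale_type
        self.train_split = train_split
        self.seed = seed

class PrincipalComponentAnalysis(PreProcessData):
    def __init__(self, variance_explained_cutoff=0.95, **kwargs):
        # PCA almost always wants normalized data, so override the
        # inherited default for this subclass only
        kwargs.setdefault("scale_type", "normalize")
        super().__init__(**kwargs)
        self.variance_explained_cutoff = variance_explained_cutoff

pca = PrincipalComponentAnalysis(
    predictor_vars=[[0., 1.], [2., 3.]],
    response_var=[1., 2.],
    train_split=0.7,
    seed=123,
)
# pca.scale_type -> 'normalize' without being passed explicitly
```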