Open pennyfx opened 6 years ago
@pennyfx True, configs should be created at the top and then used consistently throughout the notebook. That would reduce the chance of errors by data scientists and help track the non-default components used for the algorithm. I will update the tutorial accordingly.
datmo-tutorials/face-recognition/experimentation.ipynb
The code sections where config and stats are written don't have any explanation.
Also, n_jobs is actually being used above the config object, so either define a variable above and use it in both locations, so that there is a relationship between config.n_jobs and where it's actually used above, OR move this entire config object near the top of the file and write it ASAP. IMO, creating and writing the config object near the top of the model makes the most sense, because you can use those config values instead of magic numbers for the rest of the code.
There are probably more "magic numbers" that can be turned into config vars as well.
Examples:
df['is_train'] = np.random.uniform(0, 1, len(df)) <= .60
.60 is a magic number, move to config
RandomForestClassifier(
most of the parameters in this classifier could also be turned into config vars
The purpose of config is to create a single place to "tweak config values" and end up with different results. A developer should never have to change the same magic number in two places.
The same is true for the KNN classifier variable n_neighbors.
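To make the suggestion concrete, here is a rough sketch of what "config at the top" could look like. It is not the notebook's actual code: the key names (train_fraction, n_estimators, random_seed), the values, and the toy DataFrame are all assumptions made up for illustration. Only n_jobs, n_neighbors, the .60 split, and the two classifiers come from the notebook.

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.neighbors import KNeighborsClassifier

# Hypothetical config, defined once at the top. Key names and values are
# placeholders for illustration, not the tutorial's actual settings.
config = {
    "train_fraction": 0.60,
    "n_estimators": 50,
    "n_jobs": 2,
    "n_neighbors": 3,
    "random_seed": 0,
}

# Toy data standing in for the notebook's real features (assumption).
rng = np.random.RandomState(config["random_seed"])
features = ["f0", "f1", "f2", "f3"]
df = pd.DataFrame(rng.rand(200, len(features)), columns=features)
df["label"] = (df["f0"] > 0.5).astype(int)

# The split threshold is read from config instead of a literal .60.
df["is_train"] = rng.uniform(0, 1, len(df)) <= config["train_fraction"]
train, test = df[df["is_train"]], df[~df["is_train"]]

# Classifier parameters are read from the same config object, so each
# value only ever has to be changed in one place.
forest = RandomForestClassifier(
    n_estimators=config["n_estimators"],
    n_jobs=config["n_jobs"],
    random_state=config["random_seed"],
)
forest.fit(train[features], train["label"])

knn = KNeighborsClassifier(n_neighbors=config["n_neighbors"])
knn.fit(train[features], train["label"])
```

With this layout, whatever gets written out as the config is by construction the same set of values the code actually ran with.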