datmo / datmo-tutorials

Tutorials and training material for the Datmo Platform
MIT License

"face-recognition/experimentation.ipynb" Notebook sections where config/stats files are created need comments #1

Open pennyfx opened 6 years ago

pennyfx commented 6 years ago

datmo-tutorials/face-recognition/experimentation.ipynb

The code sections where config and stats are written don't have any explanation.

Also, n_jobs is used above the config object. Either define a variable near the top and use it in both places, so that config.n_jobs is tied to where the value is actually used, or move the entire config object near the top of the file and write it as soon as possible. IMO, creating and writing the config object near the top of the notebook makes the most sense, because you can then use those config values instead of magic numbers in the rest of the code.
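To make the suggestion concrete, here is a minimal sketch of a config defined and written once at the top, with n_jobs read back from it wherever it is needed. The file name config.json, the helper classifier_kwargs, and the value 2 are assumptions for illustration only; they are not taken from the notebook.

```python
import json

# Hypothetical config, defined once at the top of the notebook.
# The value is illustrative, not the one used in the tutorial.
config = {"n_jobs": 2}

# Write the config as early as possible so the saved file always
# matches the values used further down (file name is an assumption).
with open("config.json", "w") as f:
    json.dump(config, f, indent=2)


def classifier_kwargs(cfg):
    """Build keyword arguments from the config instead of repeating literals."""
    return {"n_jobs": cfg["n_jobs"]}


# Every later use reads from config, so there is only one place to edit.
kwargs = classifier_kwargs(config)
```

Changing config["n_jobs"] then updates both the saved file and every call that uses it.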

There are probably more "magic numbers" that can be turned into config vars as well.

Examples:

df['is_train'] = np.random.uniform(0, 1, len(df)) <= .60
Here .60 is a magic number; move it to config.

RandomForestClassifier: most of the parameters passed to this classifier could also be turned into config vars.

The purpose of config is to create a single place to "tweak config values" and end up with different results. A developer should never have to change the same magic number in two places.

The same is true for the KNN classifier's n_neighbors variable.
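As a rough sketch of what moving these magic numbers into config could look like: the key names (train_fraction, random_forest, knn, seed), the helper split_mask, and all values other than .60 are hypothetical. The split uses the standard library's random module in place of np.random.uniform so the snippet runs without extra dependencies.

```python
import random

# Hypothetical config: one place to tweak values. Only 0.60 comes
# from the notebook line quoted above; the rest are illustrative.
config = {
    "train_fraction": 0.60,
    "seed": 0,
    "random_forest": {"n_estimators": 100, "n_jobs": 2},
    "knn": {"n_neighbors": 5},
}


def split_mask(n_rows, train_fraction, seed):
    """Return a list of booleans marking which rows go to the training set."""
    rng = random.Random(seed)
    return [rng.uniform(0, 1) <= train_fraction for _ in range(n_rows)]


is_train = split_mask(1000, config["train_fraction"], config["seed"])

# With scikit-learn installed, the nested dicts could be unpacked
# directly into the estimators, so no parameter is typed twice:
#   RandomForestClassifier(**config["random_forest"])
#   KNeighborsClassifier(**config["knn"])
```

Keeping each classifier's parameters in its own nested dict means the saved config also records exactly which non-default settings were used.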

shabazpatel commented 6 years ago

@pennyfx True, configs should be created at the top and then used consistently throughout the notebook. That would reduce the chance of errors by data scientists and help track the non-default components used for the algorithm. I will update the tutorial accordingly.