MLblog / jads_kaggle

Contains our group's work in various kaggle competitions
MIT License

Organize starter code #125

Closed joepvdbogaert closed 5 years ago

joepvdbogaert commented 5 years ago

This PR organizes the ideas from my previous notebook into proper modules with documentation.

Organization

I created two modules, engineering.py and modeling.py, which matches our current organization and allows our two subteams to work in separate files. Together, the modules contain all the functions required to reproduce the results from my notebook.

The exception is the progress function, which I put in common.utils. It prints progress messages with timestamps and can optionally print on the same line as the previous message, which is useful for progress-bar-like output.
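For readers who have not opened common.utils, here is a minimal sketch of what such a helper could look like. The signature, the `same_line` and `stream` parameters, and the timestamp format are assumptions made for illustration, not the actual implementation in this PR.

```python
import sys
from datetime import datetime


def progress(message, same_line=False, stream=sys.stdout):
    """Write `message` prefixed with a timestamp (hypothetical signature).

    With same_line=True the text starts with a carriage return and has no
    trailing newline, so the next call overwrites it on the terminal.
    """
    stamp = datetime.now().strftime("%H:%M:%S")
    text = "[{}] {}".format(stamp, message)
    if same_line:
        # Return to the start of the line instead of moving to a new one.
        stream.write("\r" + text)
    else:
        stream.write(text + "\n")
    stream.flush()
    return text
```

Calling `progress("fold {}/5".format(i), same_line=True)` inside a loop would then keep rewriting a single status line, followed by one plain `progress("done")` call to end it.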

I also added a notebook that shows in a more concise way how to use the implemented functions and classes.

New features

I added only one small new feature to the code. You can now ask the cross-validation function to predict on the test data in every fold. This way, we can perform blending without rewriting the whole cross-validation function.
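To illustrate the idea, here is a rough sketch of a cross-validation loop that optionally predicts on the test set in every fold. The names `cross_validate`, `make_model`, `X_test` and the toy `MeanModel` are hypothetical and do not reflect the actual signature in modeling.py. Averaging the per-fold test predictions is also an assumption about how the blending would be done.

```python
import numpy as np


class MeanModel:
    """Toy stand-in model: always predicts the mean of the training target."""

    def fit(self, X, y):
        self.mean_ = float(np.mean(y))
        return self

    def predict(self, X):
        return np.full(len(X), self.mean_)


def cross_validate(make_model, X, y, X_test=None, n_folds=5, seed=0):
    """K-fold CV returning out-of-fold predictions and, if X_test is given,
    the test predictions averaged over the models fitted in each fold."""
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(len(X)), n_folds)
    oof = np.zeros(len(X))
    test_preds = None if X_test is None else np.zeros(len(X_test))
    for val_idx in folds:
        # Train on everything except the current validation fold.
        train_mask = np.ones(len(X), dtype=bool)
        train_mask[val_idx] = False
        model = make_model().fit(X[train_mask], y[train_mask])
        oof[val_idx] = model.predict(X[val_idx])
        if X_test is not None:
            # Each fold's model contributes equally to the test prediction.
            test_preds += model.predict(X_test) / n_folds
    return oof, test_preds
```

The out-of-fold predictions can serve as training input for a blender, while the averaged test predictions give the matching input at prediction time, so no second pass over the folds is needed.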