jupyterday-atlanta-2016 / proposals

Open discussions about conference proposals.

Fit Your Learning Curves For Fun and Profit #10

Open NickleDave opened 8 years ago

NickleDave commented 8 years ago

Me

David Nicholson


Abstract

Scientists who study machine learning often plot the error of a model against the amount of data used to train that model. Such plots are known as learning curves or validation curves. In 1994, Cortes et al. proposed a method for fitting these curves with an exponential decay function. Their method provides a way to predict how different models stack up against each other. Importantly, it can avoid the computationally expensive process of estimating error for large training sets. With help from a Jupyter notebook, I will introduce exponential decay functions and give a brief derivation of Cortes et al.'s method. Then I will demonstrate how to fit learning curves with their model, using the data sets built into the scikit-learn library. I will also demonstrate some less-than-ideal fits using my own (lovely) data. Lastly, I will discuss how it might be possible to detect statistically significant differences between models using the fit parameters. (Step 3: profit.) I expect the talk to be about 20 minutes.
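To give a rough idea of what fitting a learning curve looks like in a notebook, here is a minimal sketch. It is not Cortes et al.'s procedure as it will be derived in the talk: it assumes a generic three-parameter exponential decay, `error(n) = a + b * exp(-c * n)`, and uses SciPy's `curve_fit` on synthetic error values. The function name `exp_decay`, the parameter values, and the training-set sizes are all hypothetical; in practice the errors could come from something like scikit-learn's `learning_curve`.

```python
import numpy as np
from scipy.optimize import curve_fit


def exp_decay(n, a, b, c):
    """Error at training-set size n: asymptote a plus a decaying term b * exp(-c * n)."""
    return a + b * np.exp(-c * n)


# Hypothetical training-set sizes and synthetic test errors (illustration only,
# not real results from any model or data set).
rng = np.random.default_rng(0)
train_sizes = np.array([50, 100, 200, 400, 800, 1600], dtype=float)
true_params = (0.05, 0.40, 0.005)  # assumed values used to generate the fake data
test_error = exp_decay(train_sizes, *true_params) + rng.normal(0, 0.002, train_sizes.size)

# Fit using only the smaller training sets, then extrapolate to the largest one,
# which is the expensive point we would like to avoid measuring directly.
n_fit = 5
popt, pcov = curve_fit(
    exp_decay,
    train_sizes[:n_fit],
    test_error[:n_fit],
    p0=(0.1, 0.5, 0.01),  # rough initial guess for (a, b, c)
)
a_hat, b_hat, c_hat = popt
predicted = exp_decay(train_sizes[-1], *popt)

print(f"estimated asymptotic error: {a_hat:.3f}")
print(f"predicted error at n={int(train_sizes[-1])}: {predicted:.3f}")
print(f"held-out error at n={int(train_sizes[-1])}: {test_error[-1]:.3f}")
```

The fitted asymptote `a_hat` is the quantity one would compare across models, and the diagonal of `pcov` gives a rough variance for each fit parameter, which may be a starting point for the significance question raised at the end of the abstract.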


Affiliation

Emory University

About Me

www.nicholdav.info

NickleDave commented 8 years ago

@bollwyvl @tonyfast I realize I'm late to the party ... let me know if you think this is something that could fit into what you've got scheduled. Thanks

tonyfast commented 8 years ago

Alright, we've got you in the science workshop... we'll be putting some more structure around it (repos, chat, website, etc.) soon! Thanks!