implement the app that takes the input format and creates the output format
either 1) prompt for the training data to use or 2) directly include the learned model
Validation
validate with Appendix B
create a boolean matrix comparing our manually estimated label with the model's predicted label
compute recall per category
compute precision per category
discuss quality of results and whether higher yield or higher precision is more important
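The per-category precision and recall steps above could be sketched as follows. This is a minimal sketch, not the project's actual evaluation code: the function name `per_category_metrics` and the label strings are illustrative assumptions, and it assumes one manually assigned label and one predicted label per repository.

```python
from collections import Counter


def per_category_metrics(true_labels, predicted_labels):
    """Return {category: (precision, recall)} for two parallel label lists.

    Hypothetical helper: true_labels are our manual labels,
    predicted_labels are the model's outputs for the same repositories.
    """
    if len(true_labels) != len(predicted_labels):
        raise ValueError("label lists must have the same length")

    # confusion[(true, predicted)] = number of repositories with that pair
    confusion = Counter(zip(true_labels, predicted_labels))
    categories = sorted(set(true_labels) | set(predicted_labels))

    metrics = {}
    for cat in categories:
        true_positives = confusion[(cat, cat)]
        predicted_as_cat = sum(n for (_, p), n in confusion.items() if p == cat)
        actually_cat = sum(n for (t, _), n in confusion.items() if t == cat)
        # Report 0.0 when a category was never predicted or never occurs
        precision = true_positives / predicted_as_cat if predicted_as_cat else 0.0
        recall = true_positives / actually_cat if actually_cat else 0.0
        metrics[cat] = (precision, recall)
    return metrics


# Illustrative labels only, not real validation data
manual = ["Data", "Data", "Software", "Homework", "Software"]
predicted = ["Data", "Software", "Software", "Homework", "Software"]
metrics = per_category_metrics(manual, predicted)
for category, (precision, recall) in metrics.items():
    print(f"{category}: precision={precision:.2f} recall={recall:.2f}")
```

In this toy input, one Data repository is misclassified as Software, so Data keeps full precision but loses recall, while Software keeps full recall but loses precision. That is the trade-off the discussion step is meant to weigh.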
Extension
use the model for a nice app
Furthermore
document 3 repos where we think our model will yield better results
write an installation and user manual
document decisions we made for features, algorithms, data structures, software development tools and practices
Notes
Examples for DATA-Repositories
openaddresses / openaddresses
unitedstates / congress-legislators
OpenExoplanetCatalogue / open_exoplanet_catalogue
Chicago / food-inspections-evaluation
GSA / data
cernopendata / opendata.cern.ch
benbalter / congressional-districts
Extension
"Improve yourself"
Login with Github
-> Stats of your own repos e.g. 30% Data, 70% Software
-> Stats of repos your friends recently starred
| Data | Software | Homework | ... |
-> Stats of recently trending repos
| Data | Software | Homework | ... |
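The statistics sketched above (e.g. 30% Data, 70% Software) boil down to counting predicted categories per set of repositories. A minimal sketch, assuming the classifier has already produced one label per repository; `category_shares` and `format_row` are hypothetical helper names, and fetching repositories from GitHub is left out:

```python
from collections import Counter


def category_shares(labels):
    """Return {category: percentage} rounded to whole percent."""
    counts = Counter(labels)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {cat: round(100 * n / total) for cat, n in counts.items()}


def format_row(shares):
    """Render the shares as one table row, e.g. '| Data 30% | Software 70% |'."""
    cells = [f"{cat} {pct}%" for cat, pct in sorted(shares.items())]
    return "| " + " | ".join(cells) + " |"


# Illustrative input: predicted labels for ten of a user's own repos
own_repo_labels = ["Data"] * 3 + ["Software"] * 7
shares = category_shares(own_repo_labels)
print(format_row(shares))
```

The same two helpers would serve all three views (own repos, repos starred by friends, trending repos). Only the list of labels passed in changes.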
Sources: