WGierke / git_better

3rd-placed solution for the informatiCup2017
https://git-better.herokuapp.com/
Apache License 2.0

Challenge & Notes #2

Open WGierke opened 8 years ago

WGierke commented 8 years ago

Challenge

Documentation Structure

  1. Data Exploration and Prediction Model
    • analyze and document relevant features
    • document how to avoid overfitting
    • explain why we've decided to use the features
    • explain how we've developed the prediction model
  2. Automated Classification
    • implement the app that takes the input format and creates the output format
    • either 1) prompt for the training data to use or 2) directly include the learned model
  3. Validation
    • validate with Appendix B
    • create a boolean matrix comparing our estimated label with the predicted one
    • compute recall per category
    • compute precision per category
    • discuss the quality of the results and whether higher yield (recall) or higher precision is more important
  4. Extension
    • use the model for a nice app
  5. Furthermore
    • document 3 repos where we think our model will yield better results
    • installation and user manual
    • document decisions we made for features, algorithms, data structures, software development tools and practices
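
A minimal sketch of the validation step (item 3), assuming we have two equally long lists: the labels we estimated ourselves and the labels the model predicted. The function name `per_category_scores` and the sample labels are made up for illustration and are not taken from the project code. Only `DATA` appears in these notes; the other category names are placeholders.

```python
from collections import Counter


def per_category_scores(true_labels, predicted_labels):
    """Return {category: (precision, recall)} for two paired label lists."""
    if len(true_labels) != len(predicted_labels):
        raise ValueError("label lists must have the same length")

    true_count = Counter(true_labels)            # how often each category really occurs
    predicted_count = Counter(predicted_labels)  # how often each category was predicted
    true_positives = Counter(
        t for t, p in zip(true_labels, predicted_labels) if t == p
    )

    scores = {}
    for category in sorted(set(true_labels) | set(predicted_labels)):
        tp = true_positives[category]
        # Fall back to 0.0 when a category was never predicted / never occurs.
        precision = tp / predicted_count[category] if predicted_count[category] else 0.0
        recall = tp / true_count[category] if true_count[category] else 0.0
        scores[category] = (precision, recall)
    return scores


# Hypothetical labels, for illustration only.
true_labels = ["DATA", "DEV", "DOCS", "DATA", "DEV"]
predicted_labels = ["DATA", "DEV", "DATA", "DATA", "DOCS"]

# Boolean vector: True where our estimated label matches the predicted one.
matches = [t == p for t, p in zip(true_labels, predicted_labels)]

scores = per_category_scores(true_labels, predicted_labels)
for category, (precision, recall) in scores.items():
    print(f"{category}: precision={precision:.2f} recall={recall:.2f}")
```

Reporting both numbers per category makes the yield vs. precision trade-off visible: a category can have perfect recall while still collecting false positives from other categories.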

Notes

Examples for DATA-Repositories:

    • openaddresses / openaddresses
    • unitedstates / congress-legislators
    • OpenExoplanetCatalogue / open_exoplanet_catalogue
    • Chicago / food-inspections-evaluation
    • GSA / data
    • cernopendata / opendata.cern.ch
    • benbalter / congressional-districts

Extension

"Improve yourself"

Sources: