WGierke / git_better

3rd-placed solution for the informatiCup2017
https://git-better.herokuapp.com/
Apache License 2.0

Ensemble Numeric Classifiers #43

Closed WGierke closed 7 years ago

WGierke commented 7 years ago

Refactored the classifiers so that they can now be used in a VotingClassifier. Current output:

$ python app/main.py -i data/example-input.txt -t data/processed_data.csv -p
>>> 
Score on Test set: 0.989130434783
Score on Validation set: 0.129032258065
Score on Additional Validation set: 0.132653061224
Prediction for input: ['WEB' 'WEB' 'HW' 'WEB' 'WEB' 'WEB' 'WEB']

Baschdl commented 7 years ago

We could use only the best single classifier instead of ensembling all of them.
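
Picking the best single classifier could be sketched roughly as follows. This is only an illustration: it assumes the validation score is the selection criterion, and it uses the validation scores reported later in this thread instead of calling the project's classifiers. The names `validation_scores` and `pick_best` are hypothetical and not part of the repository.

```python
# Validation scores as reported later in this thread (not recomputed here).
validation_scores = {
    "DescriptionClassifier": 0.451612903226,
    "ReadmeClassifier": 0.483870967742,
    "NumericEnsembleClassifier": 0.129032258065,
    "VotingClassifier": 0.322580645161,
}


def pick_best(scores):
    """Return the (name, score) pair with the highest score."""
    return max(scores.items(), key=lambda item: item[1])


best_name, best_score = pick_best(validation_scores)
print(best_name, best_score)
```

Selecting on the validation set and not on the test score matters here, because the test scores in this thread are much higher than the validation scores for every classifier.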

WGierke commented 7 years ago

Solves #42.

$ python app/main.py -i data/example-input.txt -t data/processed_data.csv -p
>>> 
<class 'classifier.DescriptionClassifier'>
Score on Test: 0.869565217391
Score on Validation: 0.451612903226
Score on Additional Validation: 0.42
Prediction for input data:
['DEV' 'HW' 'EDU' 'WEB' 'WEB' 'DATA' 'DEV']
<class 'classifier.ReadmeClassifier'>
Score on Test: 0.83152173913
Score on Validation: 0.483870967742
Score on Additional Validation: 0.42
Prediction for input data:
['DEV' 'HW' 'EDU' 'DEV' 'WEB' 'HW' 'HW']
<class 'classifier.NumericEnsembleClassifier'>
Score on Test: 0.972826086957
Score on Validation: 0.129032258065
Score on Additional Validation: 0.133333333333
Prediction for input data:
['WEB' 'WEB' 'HW' 'WEB' 'WEB' 'WEB' 'WEB']
<class 'sklearn.ensemble.voting_classifier.VotingClassifier'>
Score on Test: 0.983695652174
Score on Validation: 0.322580645161
Score on Additional Validation: 0.44
Prediction for input data:
['DEV' 'HW' 'EDU' 'WEB' 'WEB' 'DATA' 'WEB']
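
To illustrate what an ensemble does with the individual predictions above, here is a minimal sketch of majority (hard) voting in plain Python. It is not the project's code: the thread does not say whether the VotingClassifier uses hard or soft voting, and sklearn fits its own copies of the estimators, so the result need not match the VotingClassifier output above. The function `hard_vote` and its tie-break rule (ties go to the label from the earliest-listed classifier) are assumptions made for this example.

```python
from collections import Counter

# Per-classifier predictions copied from the output above.
predictions = {
    "DescriptionClassifier": ["DEV", "HW", "EDU", "WEB", "WEB", "DATA", "DEV"],
    "ReadmeClassifier": ["DEV", "HW", "EDU", "DEV", "WEB", "HW", "HW"],
    "NumericEnsembleClassifier": ["WEB", "WEB", "HW", "WEB", "WEB", "WEB", "WEB"],
}


def hard_vote(prediction_lists):
    """Majority vote per sample.

    Assumed tie-break: among equally frequent labels, take the one
    predicted by the classifier that appears first in the input.
    """
    voted = []
    for votes in zip(*prediction_lists):
        counts = Counter(votes)
        # max() returns the first element with the highest count.
        voted.append(max(votes, key=counts.__getitem__))
    return voted


combined = hard_vote(list(predictions.values()))
print(combined)
```

On the last two samples all three classifiers disagree, so the tie-break rule alone decides the label there.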