Closed WGierke closed 7 years ago
We could use only the single best classifier instead of ensembling all of them.
Solves #42
$ python app/main.py -i data/example-input.txt -t data/processed_data.csv -p
>>>
<class 'classifier.DescriptionClassifier'>
Score on Test: 0.869565217391
Score on Validation: 0.451612903226
Score on Additional Validation: 0.42
Prediction for input data:
['DEV' 'HW' 'EDU' 'WEB' 'WEB' 'DATA' 'DEV']
<class 'classifier.ReadmeClassifier'>
Score on Test: 0.83152173913
Score on Validation: 0.483870967742
Score on Additional Validation: 0.42
Prediction for input data:
['DEV' 'HW' 'EDU' 'DEV' 'WEB' 'HW' 'HW']
<class 'classifier.NumericEnsembleClassifier'>
Score on Test: 0.972826086957
Score on Validation: 0.129032258065
Score on Additional Validation: 0.133333333333
Prediction for input data:
['WEB' 'WEB' 'HW' 'WEB' 'WEB' 'WEB' 'WEB']
<class 'sklearn.ensemble.voting_classifier.VotingClassifier'>
Score on Test: 0.983695652174
Score on Validation: 0.322580645161
Score on Additional Validation: 0.44
Prediction for input data:
['DEV' 'HW' 'EDU' 'WEB' 'WEB' 'DATA' 'WEB']
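To make the "single best classifier" idea concrete, here is a minimal sketch that ranks the four classifiers by the validation scores printed in the log above. The helper names `best_by` and `best_by_mean` are hypothetical and not part of the repository. The test score is left out on purpose, because it is far higher than both validation scores for every classifier and so looks like a poor basis for the choice.

```python
# Scores copied from the log above, as (validation, additional validation).
scores = {
    "DescriptionClassifier": (0.451612903226, 0.42),
    "ReadmeClassifier": (0.483870967742, 0.42),
    "NumericEnsembleClassifier": (0.129032258065, 0.133333333333),
    "VotingClassifier": (0.322580645161, 0.44),
}


def best_by(scores, index):
    """Return the classifier name with the highest score at the given tuple index."""
    return max(scores, key=lambda name: scores[name][index])


def best_by_mean(scores):
    """Return the classifier name with the highest mean over both validation sets."""
    return max(scores, key=lambda name: sum(scores[name]) / len(scores[name]))


print(best_by(scores, 0))    # ReadmeClassifier
print(best_by(scores, 1))    # VotingClassifier
print(best_by_mean(scores))  # ReadmeClassifier
```

The two validation sets disagree: `ReadmeClassifier` wins on the first, while the `VotingClassifier` is slightly ahead on the additional one (0.44 vs. 0.42). The choice therefore depends on which set we trust more. Averaging over both is only one possible tie-breaker.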
Refactored the classifiers so that they now work inside a VotingClassifier. The current output is shown above.