I still need to fix some corner cases and extend this to Workers, but here is what we currently have (pushed to git):
Categories: 2
Objects in Data Set: 5
Workers in Data Set: 5
Labels Assigned by Workers: 25

[DS_Pr[porn]] Average DS estimate for prior probability of category porn: 0.6
[DS_Pr[notporn]] Average DS estimate for prior probability of category notporn: 0.4
[MV_Pr[porn]] Average Majority Vote estimate for prior probability of category porn: 0.6
[MV_Pr[notporn]] Average Majority Vote estimate for prior probability of category notporn: 0.4
[DS_Exp_Cost] Average Expected misclassification cost (for EM algorithm): 0.0
[MV_Exp_Cost] Average Expected misclassification cost (for Majority Voting algorithm): 0.41600000000000004
[NoVote_Opt_Cost] Average Expected misclassification cost (random classification): 0.5
[DS_Opt_Cost] Average Minimized misclassification cost (for EM algorithm): 0.0
[MV_Opt_Cost] Average Minimized misclassification cost (for Majority Voting algorithm): 0.31999999999999995
[NoVote_Opt_Cost] Average Minimized misclassification cost (random classification): 0.5
[Eval_Cost_MV_ML] Average Classification cost for naïve single-class classification, using majority voting (evaluation data): NaN
[Eval_Cost_DS_ML] Average Classification cost for single-class classification, using EM (evaluation data): NaN
[Eval_Cost_MV_Soft] Average Classification cost for naïve soft-label classification (evaluation data): NaN
[Eval_Cost_DS_Soft] Average Classification cost for soft-label classification, using EM (evaluation data): NaN
We may want to remove the word "Average" from the descriptions. Now that I see it in the output, it seems superfluous.
After a change:
[DataQuality_Eval_Cost_MV_ML] Data quality, naive majority voting algorithm: 0.02720691072210519
[DataQuality_Eval_Cost_MV_Soft] Data quality, naive soft label: -0.055352502775821176
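For context, my reading of these data-quality figures is that they normalize the classification cost against the cost of a random (spammer) labeling, so 1.0 means zero-cost labels, 0.0 means no better than random, and a negative value (as for the soft-label line above) means worse than random. A minimal sketch of that normalization, assuming the baseline is the random-classification cost reported in the statistics; the function name is illustrative, not the committed code:

```python
def data_quality(classification_cost, random_baseline_cost):
    """Quality = 1 - cost / baseline.

    1.0  -> zero misclassification cost (perfect data)
    0.0  -> no better than labeling at random from the priors
    <0.0 -> actively worse than the random baseline
    """
    return 1.0 - classification_cost / random_baseline_cost

# Made-up numbers in the spirit of the report:
# a majority-vote cost of 0.30 against a 0.31 random baseline
print(data_quality(0.30, 0.31))   # ~0.032, i.e. barely better than random
```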
See the related commit on GitHub to figure out which format would be best, and let me know. Otherwise, simply close it :)
Overall Statistics
Categories: 2 ==> OK
Objects in Data Set: 1000 ==> OK
Workers in Data Set: 83 ==> OK
Labels Assigned by Workers: 5000 ==> OK
Data Statistics
Average[Object]: 500.5 ==> REMOVE
Average[DS_Pr[1]]: 0.313 ==> DS estimate for prior probability of category [1]
Average[DS_Pr[0]]: 0.687 ==> DS estimate for prior probability of category [0]
Average[DS_Category]: 0.297 ==> REMOVE
Average[MV_Pr[1]]: 0.281 ==> Majority Vote estimate for prior probability of category [1]
Average[MV_Pr[0]]: 0.719 ==> Majority Vote estimate for prior probability of category [0]
Average[MV_Category]: 0.261 ==> Majority Vote estimate for prior probability of category [1]
Average[DS_Exp_Cost]: 0.095 ==> Expected misclassification cost (for EM algorithm)
Average[MV_Exp_Cost]: 0.192 ==> Expected misclassification cost (for Majority Voting algorithm)
Average[NoVote_Exp_Cost]: 0.43 ==> Expected misclassification cost (random classification)
Average[DS_Opt_Cost]: 0.064 ==> Minimized misclassification cost (for EM algorithm)
Average[MV_Opt_Cost]: 0.139 ==> Minimized misclassification cost (for Majority Voting algorithm)
Average[NoVote_Opt_Cost]: 0.313 ==> Minimized misclassification cost (random classification)
Average[Correct_Category]: 0.491 ==> REMOVE
Average[Eval_Cost_MV_ML]: 0.304 ==> Classification cost for naïve single-class classification, using majority voting (evaluation data)
Average[Eval_Cost_DS_ML]: 0.286 ==> Classification cost for single-class classification, using EM (evaluation data)
Average[Eval_Cost_MV_Soft]: 0.33 ==> Classification cost for naïve soft-label classification (evaluation data)
Average[Eval_Cost_DS_Soft]: 0.296 ==> Classification cost for soft-label classification, using EM (evaluation data)
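As a side note on the Exp_Cost vs. Opt_Cost distinction in the descriptions above: the expected cost takes an object's soft label as-is and pays for every pair of categories it mixes, while the minimized cost first commits to the single cheapest category. A rough sketch, assuming a per-object soft label (probability per category) and a misclassification cost matrix; the names are illustrative, not the actual Java API:

```python
def expected_cost(soft_label, cost):
    """Cost when the object stays 'spread' over categories:
    sum over (true, assigned) pairs of p_true * p_assigned * cost."""
    return sum(
        p_true * p_assigned * cost[c_true][c_assigned]
        for c_true, p_true in soft_label.items()
        for c_assigned, p_assigned in soft_label.items()
    )

def minimized_cost(soft_label, cost):
    """Cost when we commit to the single category with the lowest expected cost."""
    return min(
        sum(p_true * cost[c_true][assigned] for c_true, p_true in soft_label.items())
        for assigned in soft_label
    )

# 0/1 cost matrix for two categories
cost = {"porn":    {"porn": 0.0, "notporn": 1.0},
        "notporn": {"porn": 1.0, "notporn": 0.0}}
label = {"porn": 0.8, "notporn": 0.2}
print(expected_cost(label, cost))   # 0.32 (2 * 0.8 * 0.2)
print(minimized_cost(label, cost))  # 0.2  (commit to "porn", pay only for the 0.2)
```

The minimized cost is always less than or equal to the expected cost, which matches the Opt_Cost values being lower than the Exp_Cost values in the report.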
Worker Statistics
Average[Est. Quality (Expected)]: 30.771% ==> Worker quality (expected_quality metric, EM algorithm estimates)
Average[Est. Quality (Optimized)]: 33.361% ==> Worker quality (optimized_quality metric, EM algorithm estimates)
Average[Eval. Quality (Expected)]: 25.152% ==> Worker quality (expected_quality metric, evaluation data)
Average[Eval. Quality (Optimized)]: 29.439% ==> Worker quality (optimized_quality metric, evaluation data)
Average[Number of Annotations]: 60.241 ==> Average number of labels assigned per worker
Average[Gold Tests]: 0.0 ==> Average number of gold tests per worker
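And on the two worker-quality metrics: my understanding is that a worker is scored by pushing their estimated confusion matrix through the same expected/minimized cost machinery and normalizing against a spammer whose labels carry no information, so 100% is a perfect worker and 0% is indistinguishable from noise. A hedged sketch under those assumptions (the confusion-matrix layout, baseline, and function names are mine, not the repository code):

```python
def soft_label_cost(posterior, cost, minimize):
    """Expected or minimized misclassification cost of one soft label."""
    if minimize:
        return min(sum(p * cost[c][a] for c, p in posterior.items())
                   for a in posterior)
    return sum(p_t * p_a * cost[c_t][c_a]
               for c_t, p_t in posterior.items()
               for c_a, p_a in posterior.items())

def worker_quality(confusion, priors, cost, minimize=False):
    """confusion[true][assigned] = P(worker says `assigned` | true category).
    Returns 1 - worker cost / spammer cost: 1.0 = perfect, 0.0 = spammer."""
    categories = list(priors)
    worker_cost = 0.0
    for assigned in categories:
        # how often the worker emits this label
        p_label = sum(priors[c] * confusion[c][assigned] for c in categories)
        if p_label == 0.0:
            continue
        # posterior over true categories given the worker's label (Bayes' rule)
        posterior = {c: priors[c] * confusion[c][assigned] / p_label
                     for c in categories}
        worker_cost += p_label * soft_label_cost(posterior, cost, minimize)
    # a spammer's posterior is just the prior, whatever label they emit
    spammer_cost = soft_label_cost(priors, cost, minimize)
    return 1.0 - worker_cost / spammer_cost

cost = {"porn":    {"porn": 0.0, "notporn": 1.0},
        "notporn": {"porn": 1.0, "notporn": 0.0}}
priors = {"porn": 0.6, "notporn": 0.4}
perfect = {"porn":    {"porn": 1.0, "notporn": 0.0},
           "notporn": {"porn": 0.0, "notporn": 1.0}}
print(worker_quality(perfect, priors, cost))                 # 1.0
print(worker_quality(perfect, priors, cost, minimize=True))  # 1.0
```

The `minimize` flag is what would separate the expected_quality and optimized_quality columns in the listing above.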