bigfatnoob opened 8 years ago
rank , name , med , iqr
----------------------------------------------------
1 , COCOMO2 , 42 , 86 (* | ), 0.13, 0.42, 0.99
1 , CART , 47 , 286 (*--- | ), 0.16, 0.47, 3.02
1 , COCONUT , 76 , 113 (-* | ), 0.19, 0.76, 1.32
1 , P_BASELINE , 88 , 59 (-* | ), 0.45, 0.88, 1.04
1 , BASELINE , 102 , 222 (-*- | ), 0.38, 1.02, 2.60
rank , name , med , iqr
----------------------------------------------------
1 , COCOMO2 , 8 , 224 (* | ), 0.01, 0.08, 2.25
1 , CART , 11 , 96 (* | ), 0.00, 0.11, 0.96
1 , COCONUT , 11 , 113 (* | ), 0.01, 0.11, 1.14
1 , P_BASELINE , 94 , 283 (* | ), 0.16, 0.94, 2.99
2 , BASELINE , 503 , 6223 (* | ), 0.58, 5.03, 62.81
rank , name , med , iqr
----------------------------------------------------
1 , COCONUT , 10 , 88 (* | ), 0.00, 0.10, 0.88
1 , COCOMO2 , 11 , 88 (* | ), 0.00, 0.11, 0.88
2 , CART , 32 , 140 (* | ), 0.02, 0.32, 1.42
2 , P_BASELINE , 60 , 128 (* | ), 0.27, 0.60, 1.55
2 , BASELINE , 77 , 277 (* | ), 0.15, 0.77, 2.92
rank , name , med , iqr
----------------------------------------------------
1 , COCOMO2 , 1 , 37 (* | ), 0.00, 0.01, 0.37
1 , COCONUT , 1 , 37 (* | ), 0.00, 0.01, 0.37
1 , CART , 34 , 111 (* | ), 0.01, 0.34, 1.12
1 , P_BASELINE , 100 , 121 (* | ), 0.46, 1.00, 1.67
1 , BASELINE , 255 , 1591 (* | ), 0.57, 2.55, 16.48
rank , name , med , iqr
----------------------------------------------------
1 , TEAK , 23 , 59 (* | ), 0.07, 0.23, 0.66
1 , CART , 43 , 69 (-* | ), 0.11, 0.43, 0.80
1 , P_BASELINE , 55 , 147 (-*-- | ), 0.29, 0.55, 1.76
1 , BASELINE , 61 , 199 (-*--- | ), 0.18, 0.61, 2.17
rank , name , med , iqr
----------------------------------------------------
1 , CART , 21 , 64 (* | ), 0.03, 0.21, 0.67
1 , TEAK , 37 , 115 (* | ), 0.08, 0.37, 1.23
2 , P_BASELINE , 180 , 436 (* | ), 0.39, 1.80, 4.75
2 , BASELINE , 180 , 453 (* | ), 0.40, 1.80, 4.93
rank , name , med , iqr
----------------------------------------------------
1 , CART , 54 , 92 (* | ), 0.06, 0.54, 0.98
1 , TEAK , 82 , 110 (* | ), 0.22, 0.82, 1.32
1 , P_BASELINE , 99 , 176 (* | ), 0.31, 0.99, 2.07
2 , BASELINE , 966 , 2676 (* | ), 2.24, 9.66, 29.00
rank , name , med , iqr
----------------------------------------------------
1 , TEAK , 21 , 93 (* | ), 0.05, 0.21, 0.98
1 , CART , 24 , 81 (* | ), 0.08, 0.24, 0.89
1 , BASELINE , 45 , 122 (* | ), 0.19, 0.45, 1.41
1 , P_BASELINE , 51 , 178 (* | ), 0.15, 0.51, 1.93
rank , name , med , iqr
----------------------------------------------------
1 , CART , 7 , 44 (* | ), 0.01, 0.07, 0.45
1 , TEAK , 19 , 86 (* | ), 0.02, 0.19, 0.88
2 , P_BASELINE , 37 , 72 (* | ), 0.12, 0.37, 0.84
2 , BASELINE , 37 , 97 (* | ), 0.12, 0.37, 1.09
so we seem to be suggesting that linear regression is NOT a good baseline, but that either decision trees (with CART) or something like TEAK is
BTW, are you sure about the coc81 results? A median error of 1%? Never seen that before.
Linear regression on its own is not a good baseline for effort estimation, so what I think we should propose is a slimmer version of it that prunes out unnecessary attributes. The correlation between each attribute and effort is estimated with Spearman's correlation, and only the top X ranked attributes are selected, where X = rows/10. CART and TEAK cannot be called baselines since
1) they are not easy to implement, and
2) they need tuning to set magic parameters.
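The thread does not show the pruning code, so here is a minimal sketch of the proposed attribute selection under the stated rule (top rows/10 attributes by Spearman's correlation with effort). All function names (`ranks`, `spearman`, `prune_attributes`) are hypothetical, not from the actual implementation:

```python
# Hypothetical sketch: prune attributes by Spearman correlation with effort,
# keeping only the top rows/10 as described above.

def ranks(values):
    """Rank of each value, with average ranks for ties."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    r = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # group equal values so they share an average rank
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            r[order[k]] = avg
        i = j + 1
    return r

def spearman(x, y):
    """Spearman's rho = Pearson correlation of the ranks."""
    rx, ry = ranks(x), ranks(y)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    vx = sum((a - mx) ** 2 for a in rx)
    vy = sum((b - my) ** 2 for b in ry)
    return cov / (vx * vy) ** 0.5

def prune_attributes(rows, effort):
    """Return indices of the top rows/10 attributes by |rho| with effort."""
    n_attrs = len(rows[0])
    keep = max(1, len(rows) // 10)
    scores = [(abs(spearman([r[a] for r in rows], effort)), a)
              for a in range(n_attrs)]
    scores.sort(reverse=True)
    return sorted(a for _, a in scores[:keep])
```

With 10 rows, keep = 10//10 = 1, so only the single attribute most correlated with effort survives; the pruned rows would then be fed to ordinary linear regression.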
The error measure used here is the one asked for by the EMSE reviewers. With our standard MRE measure it is around 42.
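For reference, the standard MRE summary mentioned above can be sketched as follows (the EMSE reviewers' measure itself is not shown in the thread; `mre` and `mdmre` are illustrative names, and the median values in the tables appear to be reported in percent):

```python
# Sketch of the standard MRE measure: magnitude of relative error per
# project, summarized as the median over all projects (MdMRE, in percent).

def mre(actual, predicted):
    """Magnitude of relative error for one project."""
    return abs(actual - predicted) / abs(actual)

def mdmre(actuals, predictions):
    """Median MRE over all projects, as a percentage."""
    errs = sorted(mre(a, p) for a, p in zip(actuals, predictions))
    n = len(errs)
    mid = n // 2
    med = errs[mid] if n % 2 else (errs[mid - 1] + errs[mid]) / 2.0
    return 100.0 * med
```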
right... but what do you think about the ACM TOSEM baseline paper? Is that what you'd call a baseline that everyone should use to compare their work against?
In the above results, the ACM TOSEM baseline is shown as BASELINE and our proposed baseline as P_BASELINE. We can see that BASELINE is way off from the other methods and cannot be compared to them, while our method is almost always in the same league as the other methods for both COCOMO and non-COCOMO data. So I think our proposed baseline is better than the one proposed by Whigham et al. in the ACM TOSEM paper, and at the same time it is simple to incorporate.
Sounds like a paper to me. First "dialog strategies", then "better baseline".
Whigham et al. suggest using linear regression as a baseline, but software effort datasets have a large number of attributes, which makes them really skewed for prediction. Here I have removed attributes (columns) with low correlation to the dependent variable, such that only the top rows/10 attributes (ranked by Spearman's correlation with effort) are kept.