Open aronwc opened 10 years ago
I got preliminary results that reveal some strange data:
1 - With an error of 29.9: @SaraLeeDesserts, Excel sheet: 71.9% Male (strange; after a quick look on Twitter, it seems more like 71.9% Female)
2 - With an error of 29.1: @AshworthGolf, Excel sheet: 71.4% Male (this seems correct, or maybe even more Male)
3 - With an error of 23.7: @mitsucars, Excel sheet: 34.4% Male (same as the first one; it looks more like 34.4% Female)
In the meantime, I don't understand how to do this: "report top 5 features weighted by (abs(feature value * model coefficient))". I'm not sure what the feature value is versus the model coefficient.
If C is the number of columns (i.e., features / account names), then clf.coef_ is the coefficients of the model (a C-length array).
Each row of Xd represents one company. It is also a C-length array.
By doing the element-wise multiplication of clf.coef_ and the ith row of Xd, we see which features have the largest influence on the prediction for company i.
E.g., coef_ = [5, -2, 6] ; row = [1, 2, 3] ; coef * row = [5, -4, 18]. Element three has the largest impact.
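The calculation above can be sketched in NumPy as follows. This is only a minimal sketch: the helper name `top_features` and the account names are made up for illustration, and it assumes the row of Xd is a dense array (a sparse row would need to be converted first). With a real model, `coef` would be `clf.coef_` and `row` would be `Xd[i]`.

```python
import numpy as np

def top_features(coef, row, feature_names, k=5):
    """Rank features by abs(feature value * model coefficient).

    Hypothetical helper, not part of scikit-learn.
    """
    # ravel() flattens a (1, C) coefficient array to length C.
    coef = np.asarray(coef, dtype=float).ravel()
    row = np.asarray(row, dtype=float).ravel()
    contributions = coef * row  # element-wise product
    # Indices sorted by decreasing absolute contribution.
    order = np.argsort(-np.abs(contributions))[:k]
    return [(feature_names[i], float(contributions[i])) for i in order]

# Same numbers as the example above; the names are placeholders.
coef = [5, -2, 6]
row = [1, 2, 3]
names = ["account_a", "account_b", "account_c"]
result = top_features(coef, row, names, k=2)
print(result)  # [('account_c', 18.0), ('account_a', 5.0)]
```

The signed contribution is kept in the output, so you can see whether a feature pushed the prediction up or down, while the ranking itself uses the absolute value.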
Include results of error analysis in report.