Closed jeffwong closed 9 years ago
Thanks for this. Much appreciated. I don't know when I'll find time to dig into it, but thanks very much for reporting it.
Harry
On 15 June 2014 06:38, Jeffrey Wong notifications@github.com wrote:
The CoxPH cpp code to calculate deviance does not agree with the equation listed in the vignette. The function
double CCoxPH::Deviance ( double adT, double adDelta, double adOffset, double adWeight, double *adF, unsigned long cLength, int cIdxOff ) { unsigned long i=0; double dL = 0.0; double dF = 0.0; double dW = 0.0; double dTotalAtRisk = 0.0;
dTotalAtRisk = 0.0; for(i=cIdxOff; i<cLength+cIdxOff; i++) { dF = adF[i] + ((adOffset==NULL) ? 0.0 : adOffset[i]); dTotalAtRisk += adWeight[i]exp(dF); if(adDelta[i]==1.0) { dL += adWeight[i](dF - log(dTotalAtRisk)); dW += adWeight[i]; } }
return -2*dL/dW;
}
has two key values, dL and dW. dL is calculating
\sum (w_i \delta_i (f(x_i) - \log R_i)
and dW is calculating
\sum w_i \delta_i
If we expand the formula for deviance in the vignette we would get
-2 \sum (w_i \delta_i [ f(x_i) - \log R_i + \log w_i ] )
the calculation for dL is valid but dW seems to be wrong. We would want dw to be calculating
\sum (w_i \log w_i)
and the final answer should combine the values dL and dW via
-2 * (dL + dW)
— Reply to this email directly or view it on GitHub https://github.com/harrysouthworth/gbm/issues/22.
This issue was moved to gbm-developers/gbm#10
The CoxPH cpp code to calculate deviance does not agree with the equation listed in the vignette. The function
double CCoxPH::Deviance ( double adT, double adDelta, double adOffset, double adWeight, double *adF, unsigned long cLength, int cIdxOff ) { unsigned long i=0; double dL = 0.0; double dF = 0.0; double dW = 0.0; double dTotalAtRisk = 0.0;
}
has two key values, dL and dW. dL is calculating
\sum (w_i \delta_i (f(x_i) - \log R_i)
and dW is calculating
\sum w_i \delta_i
If we expand the formula for deviance in the vignette we would get
-2 \sum (w_i \delta_i [ f(x_i) - \log R_i + \log w_i ] )
the calculation for dL is valid but dW seems to be wrong. We would want dw to be calculating
\sum (w_i \log w_i)
and the final answer should combine the values dL and dW via
-2 * (dL + dW)