Bug in CoxPH cpp code???

jeffwong commented 10 years ago

The CoxPH cpp code to calculate deviance does not agree with the equation listed in the vignette. The function

double CCoxPH::Deviance ( double adT, double adDelta, double adOffset, double adWeight, double *adF, unsigned long cLength, int cIdxOff ) { unsigned long i=0; double dL = 0.0; double dF = 0.0; double dW = 0.0; double dTotalAtRisk = 0.0;

dTotalAtRisk = 0.0;
for(i=cIdxOff; i<cLength+cIdxOff; i++)
{
    dF = adF[i] + ((adOffset==NULL) ? 0.0 : adOffset[i]);
    dTotalAtRisk += adWeight[i]*exp(dF);
    if(adDelta[i]==1.0)
    {
        dL += adWeight[i]*(dF - log(dTotalAtRisk));
        dW += adWeight[i];
    }
}

return -2*dL/dW;

}

has two key values, dL and dW. dL is calculating

\sum (w_i \delta_i (f(x_i) - \log R_i)

and dW is calculating

\sum w_i \delta_i

If we expand the formula for deviance in the vignette we would get

-2 \sum (w_i \delta_i [ f(x_i) - \log R_i + \log w_i ] )

the calculation for dL is valid but dW seems to be wrong. We would want dw to be calculating

\sum (w_i \log w_i)

and the final answer should combine the values dL and dW via

-2 * (dL + dW)

harrysouthworth commented 10 years ago

Thanks for this. Much appreciated. I don't know when I'll find time to dig into it, but thanks very much for reporting it.

Harry

On 15 June 2014 06:38, Jeffrey Wong notifications@github.com wrote:

The CoxPH cpp code to calculate deviance does not agree with the equation listed in the vignette. The function

double CCoxPH::Deviance ( double adT, double adDelta, double adOffset, double adWeight, double *adF, unsigned long cLength, int cIdxOff ) { unsigned long i=0; double dL = 0.0; double dF = 0.0; double dW = 0.0; double dTotalAtRisk = 0.0;

dTotalAtRisk = 0.0; for(i=cIdxOff; i<cLength+cIdxOff; i++) { dF = adF[i] + ((adOffset==NULL) ? 0.0 : adOffset[i]); dTotalAtRisk += adWeight[i]exp(dF); if(adDelta[i]==1.0) { dL += adWeight[i](dF - log(dTotalAtRisk)); dW += adWeight[i]; } }

return -2*dL/dW;

}

has two key values, dL and dW. dL is calculating

\sum (w_i \delta_i (f(x_i) - \log R_i)

and dW is calculating

\sum w_i \delta_i

If we expand the formula for deviance in the vignette we would get

-2 \sum (w_i \delta_i [ f(x_i) - \log R_i + \log w_i ] )

the calculation for dL is valid but dW seems to be wrong. We would want dw to be calculating

\sum (w_i \log w_i)

and the final answer should combine the values dL and dW via

-2 * (dL + dW)

— Reply to this email directly or view it on GitHub https://github.com/harrysouthworth/gbm/issues/22.

harrysouthworth commented 9 years ago

This issue was moved to gbm-developers/gbm#10

harrysouthworth / gbm

Bug in CoxPH cpp code??? #22