Open gaelfortin opened 4 years ago
Gerstung value computed is too high to what is eexpected according to Raphaël's data (patient + HSCT known prediction). Computing directly Gerstung's score with a line from Raphaël's data also lead to the issue (outside of Shiny app) which means that this issue is not linked to how data are inputed and transformed by the Shiny App
I will try to reproduce this error and investigate it. I'll let you know.
Great thanks ! You can investigate this issue using the issue#9_fix
branch.
Thanks to issue#9_fix
branch I managed to reproduce the error and I get the same results.
Maybe we should ask Raphael for his sessionInfo() output to make sure it's not a package update error?
I compared the f_calculation function of tests.Rmd and raphael_code.R with the diff
tool on the terminal and found no difference, which is consistent with your previous observation:
Computing directly Gerstung's score with a line from Raphaël's data also lead to the issue (outside of Shiny app) which means that this issue is not linked to how data are inputed and transformed by the Shiny App
code_raphael.txt code_test_md.txt
Just some commented lines were removed.
(base) pam@LAPTOP-353RDBOQ:/mnt/d$ diff code_raphael.txt code_test_md.txt
172,208d171
< ## Confidence intervals
< # osLoDiag <- osUpDiag <- rep(NA, length(osDiag))
<
< # if("analytical" %in% input$ciType){
< # PlogP2 <- function(x) {(x * log(x))^2}
< # errOsCr <- kmNrd$r[,2] * PlogP2(kmNrd$inc) * (1-(1-kmRel$inc) * (1-kmPrs$inc))^2 + kmRel$r[,2] * PlogP2(kmRel$inc) * (1-kmPrs$inc)^2* kmNrd$inc^2 + kmPrs$r[,2] * PlogP2(kmPrs$inc) * (1-kmRel$inc)^2* kmNrd$inc^2
< # errOsCr <- sqrt(errOsCr / PlogP2(osCr))
< # osUpCr <- osCr ^ exp(2* errOsCr)
< # osLoCr <- osCr ^ exp(-2*errOsCr)
< #segments(z, osLo[z+1] ,z,osUp[z+1], col=1, lwd=2)
< # }
< # if("simulated" %in% input$ciType){
< # Simulate CI
< # nSim <- 200
< # osCrMc <- sapply(1:nSim, function(i){
< # r <- exp(rnorm(5,0,sqrt(c(kmRel$r[,2],kmNrd$r[,2],kmPrs$r[,2], kmNcd$r[,2], kmCr$r[,2]))))
< # nrsCr <- .crAdjust(kmNrd$inc^r[2], kmRel$inc^r[1], time=x) ## Correct KM estimate for competing risk
< # diffCir <- diff(kmRel$inc^r[1]) * kmNrd$inc[-1]^r[2] ## Correct KM estimate for competing risk
< # rsCr <- computeHierarchicalSurvival(x = x, diffS0 = diffCir, S1Static = prsP, haz1TimeDep = tdPrmBaseline * exp(kmPrs$r[,1]-kmPrs$r0+log(r[3])))
< # osCr <- 1-(1-nrsCr)-(1-rsCr)
<
< # cr <- .crAdjust(kmCr$inc^r[5], kmNcd$inc^r[4], time=x)
< # ncd <- .crAdjust(kmNcd$inc^r[4], kmCr$inc^r[5], time=x)
< # osDiag <- computeHierarchicalSurvival(x = x, diffS0 = diff(cr), S1Static = osCr, haz1TimeDep = tdOsBaseline)
<
< # return(cbind(osCr, osDiag - (1-ncd)))
< # }, simplify="array")
< # osCrMcQ <- apply(osCrMc,1:2,quantile, c(0.025,0.975))
< # osLoCr <- osCrMcQ[1,,1]
< # osUpCr <- osCrMcQ[2,,1]
< # osLoDiag <- osCrMcQ[1,,2]
< # osUpDiag <- osCrMcQ[2,,2]
< # }
<
< #absolutePredictions=data.frame(x=x, cr=cr, ncd=ncd, osDiag=osDiag, nrsDiag=nrsDiag, rsDiag=rsDiag, relDiag=relDiag, osCr=osCr, nrsCr=nrsCr, relCr=relCr, rsCr=rsCr, osUpCr=osUpCr, osLoCr=osLoCr, osLoDiag=osLoDiag, osUpDiag=osUpDiag)
<
<
211a175
>
213d176
< #return(absolutePredictions)
Yes that is a good idea, ask for its sessionInfo
. Also you should try to compare the output of each command to see if/when a difference appears.
Also try to look at diff in function between Raphaël's code and Gerstung code in their own Shiny app gerstung_server.R
.
In 24b3786566e08d74733d19191b8059283835a560 commit, I calculated the osDiag for all 656 patients. I observed that all predictions are wrong and, for the most part, higher than expected :
I assume Raphael uses exactly raphael_code.R
to compute osDiag from gerstung_score_estimates.txt
and HSCT from ALFA0702_test_shiny.tsv
. If this is the case and we find no significant difference between our app/rmd code and Raphael's code, we also need to make sure we use exactly the same version of donneesAML.txt
and multistage.RData
.
I re-download multistage.RData
from gerstung-lab/AML-multistage/ and ALFA0702_test_shiny.txt
in case my version was unintentionally modified but got the same results. Do you know where donneesAML.txt
comes from?
At the moment, I don't know what to think about it. I'm waiting for Raphael's session info and I think I'll ask him to confirm that he really used the script he sent us.
As none of the patients have the same osDiag as the one reported, I lack a "positive" control to compare the output of each command.
I also add debugging/issue9/results_prediction_shiny.tsv
which contains results used to create the plot above.
Great work! Sadly it confirms our previous observations. Raphaël told me that donneesAML.txt
are given by the Gerstung lab but I could not find the exact file. I guess that it is a file that merges files contained in the AML data folder of their github.
Rerun the analysis by copy-paste the original code that is in gerstung_server.R
and transfert it before te new code Raphaël made.
I compared the version of Raphael's sessionInfo()
packages with mine.
I found, among other things, a difference for the survival
package called when loading the coxHD
package.
Survival version:
Raphaël:
2.44-1.1
PA:3.1-12
After getting the same survival
version as him, I got the same osDiag
results as before. But while rereading the code I noticed that the calculation of osDiag
in the computeAbsoluteProbabilities()
function relies on precomputed objects like coxRFXNcdTD
or coxRFXCrTD
from multistage.RData
. Thus, even if the survival
and coxHD
versions are the same as for Raphael, it is possible that some of these objects are different? As I mentionned in a message earlier, I re-downloaded multistage.RData
from gerstung-lab/AML-multistage/ but I got the same results.
Can you confirm that the multistage.RData
file came directly from gerstung-lab/AML-multistage/? So it could be different from Raphael's one in some way? We don't have his version available? Maybe we could ask him?
Yes the object multistage.RData
comes directly from their github but Raphaël may have modified it. Try with its version of the object.
Thanks !
Even with his version I get the same results (osDiag
= 0.38 for obs
= 1003 for exemple).
Concerning donneesAML.txt
, when I subset data I get different results but nothing close to 0.19 in case of obs
1003. That the only way I found to modify osDiag
results for the moment.
But it seem unlikely that this file is different from what Raphaël used?
I will wait for Raphaël news to see if he get the same results that previously.
This file should be the one that Raphaël used. Did you compare the code from gerstung app and Raphaël's code?
I compared some portions of his code with gerstung app's code and I noticed some modifications allowing to give newdata
as input, for exemple:
Raphaël's code:
riskMissing <- function(newdata)
{
sapply(models, function(m){
fit <- get(paste("coxRFX",m,"TD", sep=""))
if(!is.null(fit$na.action))
fit$Z <- fit$Z[-fit$na.action,]
PredictRiskMissing(fit, newdata, var="var2")}, simplify = FALSE)
}
Gerstung's code:
riskMissing <- reactive(
{
sapply(models, function(m){
fit <- get(paste("coxRFX",m,"TD", sep=""))
if(!is.null(fit$na.action))
fit$Z <- fit$Z[-fit$na.action,]
PredictRiskMissing(fit, getData(), var="var2")}, simplify = FALSE)
})
The each time, the core of the algorithm seem to be the same. But something may have slipped my mind.
This should not be involved in the issue. If there is no significant difference I guess we will have to wait to see which code Raphaël ran and if he can replicate his different osDiag
results... Thanks for investigating the issue!
Gerstung score is sometimes outputted as 0 (
osDiag > 0.3
) while it should be reported as 1 thus allowing a HSC transplantation. Data may not be correctly understood by Gerstung and biais score.