gertvv / gemtc

GeMTC R package: model generation for network meta-analysis
GNU General Public License v3.0
43 stars 25 forks source link

Mixing Hazard Ratios and Rates for survival analysis #35

Open joelkuiper opened 9 years ago

joelkuiper commented 9 years ago

In some (/most) oncological datasets survival rates are also (or only) given as Hazard Ratios (instead of rates or exposures currently supported in GeMTC) (e.g. progression-free survival in [2]). There has been some work on combining HR's and rate data in the same Network Meta Analysis [1]. I was asked to implement this functionality and am curious whether or not it makes sense to add the model of [1] to GeMTC. The BUGS code might need (considerable) refactoring, so I'm wondering if you think this is worthwhile.

[1] Network meta-analysis on the log-hazard scale, combining count and hazard ratio statistics accounting for multi-arm trials: A tutorial pmid

[2] http://www.ncbi.nlm.nih.gov/pubmed/24188685

gertvv commented 9 years ago

This should already be possible, just specify the (log) hazard ratios in the data.re table. Also see ?mtc.network. If there are multi-arm trials with hazard ratios only you will need to know the standard error of the log-hazard in the baseline arm, or you will have to approximate it by assuming a correlation.

The Woods paper is one of the ones I used as reference when implementing this, by the way.

joelkuiper commented 9 years ago

Hmm when I try to reproduce the results from [1] I get the following error

Compiling model graph
   Resolving undeclared variables
   Allocating nodes
Deleting model

Error in rjags::jags.model(file.model, data = syntax[["data"]], inits = syntax[["inits"]],  : 
  RUNTIME ERROR:
Cannot invert matrix: not positive definite

using the code below

data.ab <- read.table(textConnection('
study treatment responders  sampleSize
01  Salmeterol  1 229
01  Placebo 1 227
02  Fluticasone 4 374
02  Salmeterol  3 372
02  SFC 2 358
02  Placebo 7 361
03  Salmeterol  1 554
03  Placebo 2 270'), header=T)

# From table 2, multi-arm data from table 3
data.re <- read.table(textConnection('
study treatment diff  std.err
04  Placebo NA  NA
04  Fluticasone -0.276  0.203
05  Placebo NA 0.066
05  SFC -0.209  0.072
05  Salmeterol  -0.154  0.070
05  Fluticasone 0.055 0.063
'), header=T)

network <- mtc.network(data.ab=data.ab, data.re=data.re)
model <- mtc.model(network, link="cloglog", likelihood="binom")
gertvv commented 9 years ago

The standard error of the baseline log-hazard (0.066) in study 5 is greater than that of the log-hazard ratio of Fluticasone arm versus placebo (0.063), which is inconsistent.

joelkuiper commented 9 years ago

Hmm, must've misinterpreted the data from Table 3 of [1]. Would a more reasonable approach be to take the data as-is from Table 2 and write each baseline as a separate study; somehow imputing the baseline std.err for the multi-arm Placebo case?

gertvv commented 9 years ago

Since they give log-hazards, not log-hazard-ratios, the value for placebo is likely correct. I think you want their formula (9) to compute the SE of the LHR in the other arms. Having these values in the same column is kind of confusing, I will admit...

joelkuiper commented 9 years ago
data.ab <- read.table(textConnection('
study treatment responders  sampleSize
01  Salmeterol  1 229
01  Placebo 1 227
02  Fluticasone 4 374
02  Salmeterol  3 372
02  SFC 2 358
02  Placebo 7 361
03  Salmeterol  1 554
03  Placebo 2 270'), header=T)

data.re <- read.table(textConnection('
study treatment diff  std.err
04  Placebo NA  NA
04  Fluticasone -0.276  0.203
05  Placebo NA 0.066
05  SFC -0.209  0.098
05  Salmeterol  -0.154  0.096
05  Fluticasone 0.055 0.092
'), header=T)

network <- mtc.network(data.ab=data.ab, data.re=data.re)
model <- mtc.model(network, link="cloglog", likelihood="binom", linearModel="fixed")
mtc.run(model) -> results
forest(relative.effect(results, t1="Placebo"))

Something like this? I used the data from table 3 as the baseline SE of Placebo. For the other data I directly used the data from table 2. The results are very similar (although slighly off). Worthwhile to include these as a validation test?

Fixed Effect

rplot001

Random Effect

rplot001

joelkuiper commented 8 years ago

The docs specify "For trials with more than two arms, specify the standard error of the mean of the baseline arm in ‘std.err’, as this determines the covariance of the differences", could you provide an example of this? I'm a bit confused as to the mean of what exactly :wink: Free karma

Also it seems to be that the format as in table 2 of the the tutorial is most widely used. Would it make sense to use that format instead and/or provide a utility for converting it?

Btw the table as they pass it in (see appendix) does give an error!

data.re <- read.table(textConnection('
study treatment diff  std.err
04  Placebo NA  NA
04  Fluticasone -0.276  0.203
05  Placebo NA 0.066
05  SFC -0.209  0.072
05  Salmeterol  -0.154  0.070
05  Fluticasone 0.055 0.063
'), header=T)
joelkuiper commented 8 years ago

Note that the new version of GeMTC/JAGS gives different results for random effects (than the previous forest plot), not sure why. Identical code. Old one is closer to the published results.

JAGS 3.40, GeMTC 0.6-2, rjags_3-15

rplot001

JAGS 4.0.1, GeMTC 0.7-1, rjags_4-4

rplot001

gertvv commented 8 years ago

I'm not sure why this would be the case but I suspect it has to do with priors. Could you try with an explicitly set prior for the heterogeneity parameter in both cases? There have also been some changes to how the likelihood is implemented, but those shouldn't effect contrast-based data (which I think should be the only kind of data you have?).

joelkuiper commented 8 years ago

Same difference with

model <- mtc.model(network,
                   link="cloglog",
                   likelihood="binom",
                   linearModel="random",
                   hy.prior=mtc.hy.prior("std.dev", "dunif", 0, "om.scale"),
                   om.scale=1.230952)

Note that I'm using both arm based and contrast based data (data.ab and data.re), not sure if that's related! The other obvious suspect is of course the new JAGS version, but it'd be weird if that made such a difference!

gertvv commented 8 years ago

Ah, in that case it is almost certainly due to the priors on the baseline arms for the arm-based data. Those are now less informative than the old N(0, (15 * om.scale)^2) priors and also are always compatible to the likelihood. For binom/cloglog it is a U(0,1) prior on the probability for the binomial. I got these from a paper that was referenced from the TSDs or the MDM series, but I don't recall now.

joelkuiper commented 8 years ago

Ah so is there a way to override these priors to verify?

gertvv commented 8 years ago

You could cat(model$code) to get the code for both and modify the priors for the mu[i] in either one to check. You will probably have to set the inits for mu or p.base to NULL for it to run.

joelkuiper commented 8 years ago

Sorry for the long code, didn't have time to write a shorter one (or somesuch)

library(gemtc)

convert <- function(re) {
    options(stringsAsFactors = FALSE)  # Because R
    results <- data.frame()
    studies <- unique(re$study)

    entry <- function(study, treatment, diff, std.err) {
        list(study = study, treatment = treatment, diff = diff, std.err = std.err)
    }

    as.entry <- function(row) {
        entry(row$study, row$treatment, row$diff, row$std.err)
    }

    for(studyId in studies) {
        data <- subset(re, study == studyId)
        if(nrow(data) == 1) {
            ## Two-arm study
            base <- entry(studyId, data$base, NA, NA)
            treatment <- as.entry(data[1, ])
            results <- do.call("rbind", list(results, base, treatment))

        } else {
            ## Multi-arm study
            ## These entries are similar, but require standard error of the mean for the baseline
            ## We try to compute this if possible, if not we approximate it

            ## Guess at most common base, which is the one most referred to
            base.table <- table(data$base)
            base.trt <- names(base.table)[[which.max(base.table)]]

            treatments <- subset(data, base == base.trt)

            calc.base.se <- function(d.ab.se, d.ac.se, d.bc.se) {
                ## Let $Var(D_{BC}) = Var(D_{AB}) + Var(D_{AC}) - 2Var(Y_A)$, thus
                ## $se_B = \sqrt{(se^2_{AB} + se^2_{CB} - se^2_{AC}) / 2}$
                sqrt((d.ab.se^2 + d.ac.se^2 - d.bc.se^2) / 2)
            }

            approx.active.comparison <- function(d.ab, d.ac) {
                ## Try to use approximate Var(D_bc) using number of participants, if available (formula 9, Brooks et.al.)
                ## assuming the standard error is proportional to 1/sqrt(n)

                stopifnot(d.ab$base.n == d.ac$base.n) ## A is base should have same number of participants
                n.a <- d.ab$base.n
                n.b <- d.ab$treatment.n
                n.c <- d.ac$treatment.n

                if(all(!is.na(c(n.a, n.b, n.c)))) {
                    sqrt(((d.ab$std.err^2 + d.ac$std.err^2) * (1/n.b + 1/n.c)) / (1/n.b + 1/n.c + 2/n.a))
                } else {
                    NA
                }
            }

            base.se <- function(b, trt1, trt2) {
                d.ab <- subset(data, base == b & treatment == trt1)
                d.ac <- subset(data, base == b & treatment == trt2)
                d.bc <- subset(data, base == trt2 & treatment == trt1)

                se <- NA
                if(nrow(d.bc) != 0) {
                    ##  We have enough data to compute baseline se
                    se <- calc.base.se(d.ab$std.err, d.ac$std.err, d.bc$std.err)
                } else {
                    ## We need to approximate D_bc
                    d.bc.std.err <- approx.active.comparison(d.ab, d.ac)
                    if(!is.na(d.bc.std.err)) {
                        se <- calc.base.se(d.ab$std.err, d.ac$std.err, d.bc.std.err)
                    }
                }
                se
            }

            ## Find the combinations of treatments that we could use (e.g. BC, BD)
            combinations <- t(combn(as.character(treatments$treatment), 2))

            ## Try methods by Brooks et.al. to computer or approximate baseline se,
            ## We take the median of all these values, NAs ommited
            se <- median(apply(combinations, 1, function(row) { base.se(base.trt, row[[1]], row[[2]]) }), na.rm=T)

            ## Append the treatments associated with the base
            if(is.finite(se)) { # not NaN, NA or infinite
                results <- rbind(results, entry(studyId, base.trt, NA, se))
                for(entry in by(treatments, 1:nrow(treatments), as.entry)) {
                    results <- rbind(results, entry)
                }
            } else {
                warning(paste("could not calculate baseline std.err from data for study", studyId, "omitting"))
            }
        }
    }

    options(stringsAsFactors = TRUE) # Because R, but lets not break compatibility
    results
}

data.ab <- read.table(textConnection('
study treatment responders  sampleSize
01  Salmeterol  1 229
01  Placebo 1 227
02  Fluticasone 4 374
02  Salmeterol  3 372
02  SFC 2 358
02  Placebo 7 361
03  Salmeterol  1 554
03  Placebo 2 270'), header=T)

re <- read.table(textConnection('
study        base   treatment   diff std.err base.n treatment.n
    4     Placebo Fluticasone -0.267   0.203     NA          NA
    5     Placebo         SFC -0.209   0.098   1524        1533
    5     Placebo  Salmeterol -0.154   0.096   1524        1521
    5     Placebo Fluticasone  0.055   0.092   1524        1534
    5  Salmeterol         SFC -0.056   0.100   1521        1533
    5 Fluticasone         SFC -0.264   0.096   1534        1533
'), header=T)

data.re <- convert(re)
data.re

network <- mtc.network(data.ab=data.ab, data.re=data.re)
model <- mtc.model(network,
                   link="cloglog",
                   likelihood="binom",
                   linearModel="fixed",
                   hy.prior=mtc.hy.prior("std.dev", "dunif", 0, "om.scale"))
mtc.run(model) -> results
forest(relative.effect(results, t1="Placebo"))

Given the format in re, attempt to transform it to the format used in data.re. Now for the big question: I assume this also works for and OR, RR, RMD?

gertvv commented 8 years ago

Your calc.base.se is definitely generic. I'm not 100% sure about your approximate method but that also doesn't look like its specific to the scale. So yes, it should work for other scales as well.

joelkuiper commented 8 years ago

Good, thanks for the sanity check! Might be worthwhile to include in the package (with some modifications, maybe)?

Gist for completeness

Drparvez commented 7 years ago

Hello Gert,

I am a Physician working in oncology and I am trying to learn Bayesian method of meta-analysis. I am not sure if you are the person to ask this question. If not please, let me know who would be the right person, if you know such person.

I run in to this problem: Using log hazard scale for studies with 0 cells seems to give discordant results depending on which arm is 0. I would expect one to be reciprocal of the other.

That is if Treatment1 0/15 and Placebo 2/13 gave a hazard ratio of "h" then, I would expect for : Placebo 0/15 and Treatment 2/13 to give hazard ratio "1/h"

And if I put these 2 studies only in the meta-analysis, I would expect to get hazard ratio of "1"

But....

data.ab <- read.table(textConnection(' study treatment responders sampleSize 01 Treatment1 0 15 01 Placebo 2 13 02 Treatment1 2 13 02 Placebo 0 15 '), header=T) network <- mtc.network(data.ab=data.ab) model <- mtc.model(network, link="cloglog", likelihood="binom", linearModel="fixed", hy.prior=mtc.hy.prior("std.dev", "dunif", 0, "om.scale")) mtc.run(model) -> result summary(result) print(result$deviance) forest(relative.effect(result, t1="Placebo"))

Gives me 0.49 (0.059, 2.7)

And If I do only the 1st study: data.ab <- read.table(textConnection(' study treatment responders sampleSize 01 Treatment1 0 15 01 Placebo 2 13 '), header=T)

I get 2.8e-9 (1.1e-27, 0.12)

And if I reverse the data: data.ab <- read.table(textConnection(' study treatment responders sampleSize 02 Treatment1 2 13 02 Placebo 0 15 '), header=T) I get 3.2 (0.24, 89.0)

Can you tell me what I am doing wrong?

If not do I need use one lnHR or the the other? Or average the two lnHR and the SE[lnHR] to get an better estimate?

Thank you very much!

DivyeshThakker commented 4 years ago

Hi

for mtc.anohe, how can i specify reference treatment? because while plotting forest plot, i am able to specify reference treatment with the use of relative.effect(..., t1="Placebo") but not in the mtc.anohe and so it's giving output by taking different reference treatment. i would appreciate ur help.