pneuvial / c3co

Inferring cancer cell clonality from copy-number data
5 stars 1 forks source link

fitC3co(): Use a good old for loop instead of while loop? #28

Closed HenrikBengtsson closed 6 years ago

HenrikBengtsson commented 6 years ago

In fitC3co() there's the following while loop:

    it <- 1
    pp <- nb.arch[it]
    cond <- FALSE  ## condition for (early) stopping
    fitList <- allRes <- allLoss <- list()
    bestConfigp <- allConfig <- NULL
    while (!cond) {
        if (verbose) mprintf("Number of latent features: %d\n", pp)

        ## Initialization
        bestRes <- NULL
        bestBIC <- -Inf
        bestConfig <- aConf <- NULL
        ## Z0 are initialized with the centered version of the data
        Z0 <- initializeZ(Yc$Y1, Yc$Y2, p=pp, ...)
        if (verbose) {
            mprintf("Parameter configuration: (%s)\n",
                    comma(colnames(configs)))
        }
        allRes[[pp]] <- list()
        allLoss[[pp]] <- list()
        for (cc in seq_len(nrow(configs))) {
            cfg <- configs[cc, , drop = TRUE]
            if (verbose) {
              mprintf(" - configuration #%d (%s) of %d: ",
                  cc, 
                  comma(sprintf("%s = %g", names(cfg), cfg)),
                  nrow(configs))
            }

            ## THIS is the function that costs comme computation 
            if (verbose) mprintf("fused lasso")
            if (is.null(Y2)) cfg <- cfg["lambda1"]  ## Should this be an error?
            res <- positiveFusedLasso(Y = Y, Z = Z0, lambda = cfg)

            if (verbose) mprintf(", BIC = ")
            stats <- modelFitStatistics(Reduce(`+`, Y), res@E$Y, res@W, res@S$Z)
            BIC <- stats[["BIC"]]
            if (verbose) mprintf("%g", BIC)
            aConf <- c(pp, cfg, stats[["PVE"]], BIC, stats[["logLik"]], stats[["loss"]])
            ## replace above line by: aConf <- stats
            allConfig <- rbind(allConfig, aConf)
            allRes[[pp]][[cc]] <- res
            allLoss[[pp]][[cc]] <- stats[["loss"]]

            if (BIC > bestBIC) { ## BIC has improved: update best model
                bestRes <- res
                bestBIC <- BIC
                bestConfig <- aConf
                if (verbose) mprintf("*")
            }
            if (verbose) mprintf("\n")
        }

        fitList[[it]] <- bestRes
        bestConfigp <- rbind(bestConfigp, bestConfig)
        ## sanity check: minor CN < major CN in the best parameter
        ## configurations (not for all configs by default)
        if (!is.null(Y2) && warn) {
            Z <- slot(bestRes, "S")
            dZ <- Z$Z2 - Z$Z1
            tol <- 1e-2  ## arbitrary tolerance...
            if (min(dZ) < -tol) {
                warning("For model with ", pp, " features, some components in minor latent profiles are larger than matched components in major latent profiles")
            }
        }

        it <- it + 1
        pp <- nb.arch[it]
        ## stop if pp has reached the max of its grid
        cond <- is.na(pp)
    }

The only way I can see this while loop working is that it loops of all values of nb.arch; there is no early stopping or similar. Is this a remnant from old days? Something like:

for (it in seq_along(nb.arch)) {
  pp <- nb.arch[it]
  ...
}
HenrikBengtsson commented 6 years ago

I went ahead a changed this. Package tests do not complain.