hansenlab / minfi

Devel repository for minfi

Problem with estimateCellCounts() #114

Open stephaniehicks opened 7 years ago

stephaniehicks commented 7 years ago

Not sure if this should be submitted as an issue, but thought it was worth mentioning in case anyone has the same problem. I was using estimateCellCounts() and came across this error:

> counts450K <- estimateCellCounts(rgset)
Loading required package: FlowSorted.Blood.450k
[estimateCellCounts] Combining user data with reference (flow sorted) data.

Error in as(rv, class(v)) : 
  no method or default for coercing “logical” to “factor”

After a bit of sleuthing I tracked it down to a problem in minfi:::.harmonizeDataFrames() inside the minfi::combine() function. It didn't like the fact that one of the columns in my colData() was a factor. Specifically this line:

   is.na(df.add[1,]) <- TRUE

I changed the column to a character:

> colData(rgset)$blahblah <- as.character(colData(rgset)$blahblah)

and then the problem with estimateCellCounts() was resolved.
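
If several columns are factors, the same workaround can be applied to all of them in one pass. Below is a minimal sketch on a plain data.frame standing in for colData(rgset); the column names and the helper name `factors_to_character` are made up for illustration and are not part of minfi.

```r
# Toy stand-in for colData(rgset); column and sample names are hypothetical.
pheno <- data.frame(
  group = factor(c("case", "control", "case")),
  age   = c(34, 51, 47),
  row.names = c("s1", "s2", "s3")
)

# Convert every factor column to character and leave other columns alone.
# Columns are replaced one at a time, so row names are kept.
factors_to_character <- function(df) {
  for (nm in names(df)) {
    if (is.factor(df[[nm]])) {
      df[[nm]] <- as.character(df[[nm]])
    }
  }
  df
}

pheno_chr <- factors_to_character(pheno)
```

On real data this would presumably be used as `colData(rgset) <- factors_to_character(colData(rgset))`, assuming `[[<-` on the DataFrame behaves as it does on a data.frame.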

kennylouie commented 7 years ago

Ran into the same problem while using the FlowSorted.CordBlood.450k reference. Adding onto what @stephaniehicks mentioned, the exact line within minfi:::.harmonizeDataFrames that messes up is on line 199 of utils.R:

is.na(df.add[1,]) <- TRUE

It seems that setting the reference set's y.only columns to NA is what fails, before they can be combined with the original x DataFrame.

A potential patch would be to change the way the additional columns that need to be added are produced.
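
One way such a patch could produce the filler columns, sketched here in base R only: indexing a data.frame with NA row indices yields all-NA rows while keeping each column's class, so a factor column stays a factor. The helper name `na_rows` and the toy columns are hypothetical. This relies on base data.frame behaviour; an S4Vectors DataFrame may not accept NA subscripts, so an actual patch in minfi would need an equivalent there.

```r
# Toy data.frame with a factor column; names are hypothetical.
ref <- data.frame(
  CellType = factor(c("Bcell", "CD4T")),
  Purity   = c(0.98, 0.95)
)

# Build n all-NA rows for the requested columns, preserving column classes,
# so no logical-to-factor coercion is needed afterwards.
na_rows <- function(df, cols, n) {
  out <- df[rep(NA_integer_, n), cols, drop = FALSE]
  rownames(out) <- NULL
  out
}

filler <- na_rows(ref, c("CellType", "Purity"), n = 3)
```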

For those who need it running now while the maintainers work on it, the following should work:

# Keep a handle on the original function so its environment and attributes can be reused.
ns.orig <- get(".harmonizeDataFrames", envir = asNamespace("minfi"))
.harmonizeDataFrames.quickfix <- function (x, y) 
{
    stopifnot(is(x, "DataFrame"))
    stopifnot(is(y, "DataFrame"))
    x.only <- setdiff(names(x), names(y))
    y.only <- setdiff(names(y), names(x))
    # Columns present only in x: this branch still blanks a copied row with is.na()<-.
    if (length(x.only) > 0) {
        df.add <- x[1, x.only]
        is.na(df.add[1, ]) <- TRUE
        y <- cbind(y, df.add)
    }
    # Columns present only in y: build all-NA columns directly instead of
    # copying a row of y and blanking it.
    if (length(y.only) > 0) {
        df.add <- data.frame(matrix(ncol = length(y.only), nrow = dim(x)[1]))
        names(df.add) <- y.only
        x <- cbind(x, df.add)
    }
    list(x = x, y = y[, names(x)])
}
# Make the replacement look like the original to the minfi namespace, then swap it in.
environment(.harmonizeDataFrames.quickfix) <- environment(ns.orig)
attributes(.harmonizeDataFrames.quickfix) <- attributes(ns.orig)
assignInNamespace(".harmonizeDataFrames", .harmonizeDataFrames.quickfix, ns="minfi")

Maybe someone else can confirm.

ewtobi commented 6 years ago

@kennylouie @stephaniehicks I ran into the same problem with FlowSorted.CordBlood.450k. The proposed fix worked, thank you @kennylouie

royfrancis commented 5 years ago

Still getting this error with estimateCellCounts().

library(FlowSorted.CordBlood.450k)
cc <- estimateCellCounts(rg,compositeCellType="CordBlood")

[estimateCellCounts] Combining user data with reference (flow sorted) data.

 Error in as(rv, class(v)) : 
  no method or default for coercing "logical" to "factor" 

The above fix from kennylouie doesn't seem to work. I have factors in my rg pData, so I tried to change it all to character.

pData(rg) <- DataFrame(as.data.frame(sapply(pData(rg),as.character),stringsAsFactors=F))

Then I get this error:

[estimateCellCounts] Combining user data with reference (flow sorted) data.

Error in DataFrame(sampleNames = c(colnames(rgSet), colnames(referenceRGset)),  : 
  different row counts implied by arguments
> validObject(rg)
[1] TRUE
> validObject(FlowSorted.CordBlood.450k)
[1] TRUE
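
One thing that may be worth checking with the sapply() conversion above, offered as an unconfirmed guess rather than a diagnosis: sapply() returns a plain character matrix, so the sample names stored as row names are not carried over into the rebuilt table. The sketch below shows this on a toy data.frame (column and sample names are hypothetical). Whether this is what triggers the row-count error inside estimateCellCounts() is not established in this thread.

```r
# Toy stand-in for pData(rg); names are hypothetical.
pheno <- data.frame(
  group = factor(c("case", "control", "case")),
  batch = factor(c("a", "b", "a")),
  row.names = c("s1", "s2", "s3")
)

# Same conversion pattern as in the comment above, on a plain data.frame.
converted <- as.data.frame(sapply(pheno, as.character), stringsAsFactors = FALSE)

rownames(pheno)      # original sample names
rownames(converted)  # automatic row names; the sample names are gone
```

Comparing `rownames(pData(rg))` and `colnames(rg)` before and after the conversion would show whether the sample names survived.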