hansenlab / minfi

Devel repository for minfi

getBeta() long vectors not supported yet #178

Open stewart999 opened 5 years ago

stewart999 commented 5 years ago

Hi, when I run getBeta() on an EPIC RGset, it fails with the error "long vectors not supported yet". I suspect this is due to the size of the object. If so, is there any way round this?

Thanks

```
library(minfi)
library(IlluminaHumanMethylationEPICmanifest)
RGset <- read.metharray.exp(base=NULL, targets=targets, extended=TRUE)
pp <- preprocessRaw(RGset)
beta <- getBeta(pp)

Error in pmax(...) : long vectors not supported yet: ../../src/include/Rinlinedfuns.h:519

dim(pp)
[1] 866091 3412
```

kasperdanielhansen commented 5 years ago

To be clear, the preprocessRaw() function call works? It is only getBeta()? Could you run getBeta() and provide the output of traceback() immediately following getBeta()?

stewart999 commented 5 years ago

Yes, preprocessRaw() runs without error.

> traceback()
12: pmax(...)
11: eval(mc, env)
10: eval(mc, env)
9: eval(mc, env)
8: standardGeneric("pmax")
7: pmax(e1, e2)
6: pmax2(Meth, 0)
5: pmax2(Meth, 0)
4: .betaFromMethUnmeth(Meth = getMeth(object), Unmeth = getUnmeth(object), 
       offset = offset, betaThreshold = betaThreshold)
3: .local(object, ...)
2: getBeta(pp)
1: getBeta(pp)
kasperdanielhansen commented 5 years ago

Thanks.

The underlying issue is that you have so many samples that the resulting beta matrix needs to be represented using a so-called "long vector". Long vectors are a fairly recent addition to R and are used for matrices where the product of their dimensions is greater than .Machine$integer.max (which is true in your case). The problem with long vectors is that they are not (yet) fully supported by all simple functions (in this case pmax(), which is the component-wise max).
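As a quick check of the size argument, the dimensions reported by dim(pp) earlier in the thread can be multiplied out and compared against .Machine$integer.max. This is only an illustrative sketch; the variable names are made up for the example.

```r
# Dimensions reported by dim(pp) in the original report.
n_probes  <- 866091
n_samples <- 3412

# Both values are doubles here, so the product does not overflow.
# (With integer literals, 866091L * 3412L would give NA with a warning.)
n_cells <- n_probes * n_samples

# TRUE: the matrix has more elements than an ordinary R vector can index,
# so it has to be stored as a long vector.
n_cells > .Machine$integer.max
```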

In this case, you should be able to do something like `betaA = getBeta(pp[1:400000,])`, then `betaB = getBeta(pp[400001:nrow(pp),])`, then `beta = rbind(betaA, betaB)`, which is really irritating, I can see that. But it should work and will be quicker than waiting for a fix from me. I might have to do a hack like this internally until pmax() supports long vectors; TBH I am not sure about the best way forward on my end, except that people of course need to be able to do this.
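The split-and-rbind idea above can be generalised to any number of row blocks. The sketch below uses a hypothetical helper, apply_by_row_chunks(), which is not part of minfi, and demonstrates it on a small toy matrix with pmax() standing in for the real computation.

```r
# Hypothetical helper (not part of minfi): apply `fun` to consecutive
# row blocks of `x` and stack the per-block results with rbind().
apply_by_row_chunks <- function(x, fun, chunk_size) {
  starts <- seq(1, nrow(x), by = chunk_size)
  pieces <- lapply(starts, function(s) {
    e <- min(s + chunk_size - 1, nrow(x))  # last block may be shorter
    fun(x[s:e, , drop = FALSE])
  })
  do.call(rbind, pieces)
}

# Toy stand-in for the methylation data: a 5 x 2 numeric matrix.
m <- matrix(c(-1, 2, 3, -4, 5, 6, 7, -8, 9, 10), nrow = 5)

# Clamp negatives to zero block by block, two rows at a time.
clamped <- apply_by_row_chunks(m, function(block) pmax(block, 0), chunk_size = 2)
```

With the object from this thread, the equivalent call would presumably be something like apply_by_row_chunks(pp, getBeta, chunk_size = 400000), on the assumption that the MethylSet accepts this kind of row subsetting; that has not been tested here.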

Unfortunately, you are likely to get similar issues in your downstream analysis. This is pretty irritating, but it is a fundamental current limitation in R.

stewart999 commented 5 years ago

OK, thanks very much for the explanation and for a workaround

-Stewart

kasperdanielhansen commented 5 years ago

Please tell me about other issues you run into with this large sample size. I am very curious.

kasperdanielhansen commented 5 years ago

Seems like a patch will be made (eventually) for R to address this, so I doubt I will make a work-around.

Best, Kasper

stewart999 commented 5 years ago

OK, thanks again for looking into this. The workaround is working fine

-Stewart

FreshmanJC commented 2 years ago

Could you please tell me whether this error has been resolved?

Error in dim.data.table(x) : long vectors not supported yet: ../../src/include/Rinlinedfuns.h:519