davismcc / archive-scater

An archived version of the scater repository, see https://github.com/davismcc/scater for the active version.

Cannot allocate vector of size when running scater_gui #115

Closed ericvon11 closed 7 years ago

ericvon11 commented 7 years ago

Hi,

Thanks for building this awesome package. Unfortunately, I'm unable to use the scater_gui function right now. Given my limited computational knowledge, the GUI would help a lot. I've created my SCESet from read10XResults (9.2 Mb in R), run calculateQCMetrics, and am now trying to run scater_gui on my SCESet. After the Shiny session opens in Chrome, the process either hangs and does nothing (no graphs are displayed on the scater page) or, if I retry, I often get this error:

Warning: Error in : cannot allocate vector of size 1.3 Gb
Stack trace (innermost first):

    112: unlist
    111: list_to_array
    110: laply
    109: plyr::aaply
    108: t
    107: plotSCESet
    106: plot
    105: plot
    104: renderPlot
     94: <reactive:plotObj>
     83: plotObj
     82: origRenderFunc
     81: output$plot
      4: <Anonymous>
      3: do.call
      2: print.shiny.appobj
      1: <Promise>

I've tried gc() to free up memory before this step, and R goes down to about 1GB used, but after I run scater_gui the RAM usage climbs to 99%, then drops, and the app hangs or throws the error. I'm using a computer with 16GB DDR4 and a Kaby Lake processor. This is a dataset of about 8,500 cells at about 80k reads per cell.

Do I just not have enough memory? Or is there something else wrong? It seems odd that R would use 13-14GB to accomplish this, so I figured it was worth asking here.

Thanks, Eric

LTLA commented 7 years ago

Probably something in scater_gui is doing a manipulation that does not scale well, memory-wise, with an increasing number of cells. Which is not surprising, as some of the darker corners of scater haven't really been stress-tested with a large number of cells.
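As a rough illustration of the scaling: the stack trace ends in plyr::aaply, list_to_array and unlist, which suggests a dense array is being assembled, and a dense numeric array costs 8 bytes per value. The sketch below is only back-of-the-envelope arithmetic. The helper name `dense_gib` is hypothetical, and the feature count of 20,000 is an assumed, illustrative number, since the thread does not state how many features the SCESet has.

```r
# Rough size, in GiB, of a dense numeric matrix (8 bytes per double).
# dense_gib is a hypothetical helper, not part of scater.
dense_gib <- function(n_features, n_cells) {
  n_features * n_cells * 8 / 1024^3
}

# 8,500 cells comes from the report above; 20,000 features is an
# assumption made purely for illustration.
size_gib <- dense_gib(20000, 8500)
print(round(size_gib, 2))
```

A single copy of that size is already in the range of the failed allocation, and intermediate copies made while reshaping would multiply it, which would be consistent with the RAM usage described above.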

wikiselev commented 7 years ago

I will have a look at it today. It shouldn't do anything different from what you do in your R session. Probably the function calls in the GUI haven't been updated for a while, whereas some functions have changed?

wikiselev commented 7 years ago

OK, it looks like some plotting functions are quite slow for large datasets. For example, plotPCA should be fine for ~8000 cells, but plot will be very slow and memory-hungry. Since plot is the first function called on the index page of the Shiny app, it freezes the whole app, and when it cannot get enough memory it throws an error. What happens if you call plot(your_scater_object) in your R session?
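One way to check whether the number of cells is the problem is to plot a random subset of cells first. The sketch below is a minimal example in base R, demonstrated on a plain matrix with cells in columns. The helper name `subsample_cells` and the default of 500 cells are hypothetical, and whether the same column subsetting works unchanged on an SCESet is an assumption.

```r
# Keep at most n_max randomly chosen columns (cells) of x.
# subsample_cells is a hypothetical helper, not part of scater.
subsample_cells <- function(x, n_max = 500) {
  n <- ncol(x)
  if (n <= n_max) {
    return(x)
  }
  keep <- sort(sample.int(n, n_max))
  x[, keep, drop = FALSE]
}

# Demonstration on a stand-in matrix: 100 features by 2,600 cells.
counts <- matrix(rpois(100 * 2600, lambda = 2), nrow = 100)
small <- subsample_cells(counts, n_max = 500)
print(dim(small))
```

If plotting the reduced object is fast, that would point to cell number rather than a broken function call.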

ericvon11 commented 7 years ago

I dropped down to a 2,600-cell dataset with a mean of 190k reads per cell, and it does the same. I can draw the individual plots, but they do take a while. Thanks for looking into this!

wikiselev commented 7 years ago

@davismcc it would be good to tell the user when a dataset is too large to be plotted in the GUI. Alternatively, we could remove the slow plotting functions from the GUI. What do you think?
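A minimal sketch of the first option, in base R: a size check that returns a message when the dataset exceeds a threshold and NULL otherwise. The function name `too_large_message` and the 2,000-cell threshold are hypothetical, not anything scater defines. In a Shiny app the result could feed something like `validate(need(...))` so that the message is shown in place of the plot, though how that would fit into scater_gui is an assumption.

```r
# Return a user-facing message if the dataset is too large to plot,
# or NULL if plotting can go ahead.
# too_large_message and max_cells = 2000 are hypothetical choices.
too_large_message <- function(n_cells, max_cells = 2000) {
  if (n_cells > max_cells) {
    sprintf(
      "Dataset has %d cells; the overview plot is skipped above %d cells.",
      n_cells, max_cells
    )
  } else {
    NULL
  }
}

msg <- too_large_message(8500)
if (!is.null(msg)) message(msg)
```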

davismcc commented 7 years ago

This issue was moved to davismcc/scater#14