rhondabacher / SCnorm

Normalization for single cell RNA-seq data

SCnorm function hangs at start #14

Closed jkleinj closed 6 years ago

jkleinj commented 7 years ago

Dear SCnorm team, thanks for releasing the program; I am testing it out at the moment. I am applying SCnorm to a fairly large data set (8251 genes x 1157 cells):

    DataNorm = SCnorm(counts.num[ , cell_group], Conditions = groupDesign[cell_group],
        OutputName = "group_norm", SavePDF = TRUE,
        FilterCellNum = 5, useSpikes = FALSE, NCores = 4);

The function starts on 4 cores as specified, then reverts to one core and hangs forever (it does not behave like that with your demo data). There is a work-around: when the program is started from within RStudio, that hanging process can be killed and the program continues seemingly normally on 4 cores. Could you comment on that? Kind regards, Jens

rhondabacher commented 7 years ago

Hi Jens,

Thanks for using our package. Did SCnorm output any messages during your run? That will help me understand where this hanging process occurred.

-Rhonda


nicolee-mctp commented 7 years ago

Hi Rhonda,

I think I may be having a similar issue. I ran the following on my dataset of 16710 genes and 1587 cells:

    SC_norm <- SCnorm(SCnorm_data, Cond, SavePDF = TRUE, FilterCellNum = 10,
        OutputName = "SCnorm_output", NCores = 8)

It started on 8 cores for a few seconds, then dropped to 3 cores. After waiting for about half an hour, all that has happened is:

    Gene filter is applied within each condition.
    4452 genes were not included in the normalization due to having less than 10 non-zero values.
    A list of these genes can be accessed in output, see vignette for example.

I am using RStudio. I'm going to try running outside of RStudio to see if that's the issue.

Thanks! Nicole

jkleinj commented 7 years ago

Hi Nicole, try killing the hanging process via its process ID (see the sketch below for one way to do it). In my case, RStudio then continued the calculation without further problems. Jens
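
(A minimal sketch of that workaround from a second R session; the PID is a placeholder for whatever top or ps reports for the stuck worker, and tools::pskill() from base R is just one way to send the signal.)

    # In a second R session (or a terminal), find the PID of the runaway
    # worker (e.g. with `top` or `ps aux | grep R`) and terminate it.
    library(tools)
    pskill(12345, SIGTERM)  # 12345 is a placeholder -- use the actual PID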

nicolee-mctp commented 7 years ago

Hi Jens,

Thanks so much! That worked for me.

Nicole

klprint commented 7 years ago

Hi!

When I start the normalization, I also see that eight processes (matching the NCores parameter) are spawned, but then an additional process appears which uses 100% CPU and runs forever. The other processes are no longer used, and the single active process runs for hours (even though I have just 350 cells with 18,000 genes). Any suggestions?

Best, Kevin

rhondabacher commented 7 years ago

Hi Kevin,

What version of SCnorm are you running, and where (i.e., with RStudio, etc.)? I suspect it has something to do with the parallel implementation, although I have not been able to recreate this type of error. My initial suggestion is to try downloading the development version of SCnorm (which uses mclapply for the parallel step): https://github.com/rhondabacher/SCnorm/tree/devel or the version on Bioconductor (which uses BiocParallel): https://bioconductor.org/packages/devel/bioc/html/SCnorm.html
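
For reference, installing either build might look something like this (a sketch assuming the remotes and BiocManager helper packages are available; the branch and package names are taken from the links above):

    # Development (devel) branch from GitHub -- uses mclapply for the parallel step
    remotes::install_github("rhondabacher/SCnorm", ref = "devel")

    # Bioconductor devel version -- uses BiocParallel
    BiocManager::install("SCnorm", version = "devel")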

Otherwise did the solution posted by Jens above work?

Best, Rhonda

zji90 commented 6 years ago

I ran into the same problem of the job hanging at the first stage for one of my datasets. The output says: "A list of these genes can be accessed in output, see vignette for example." I have waited for more than 30 hours but nothing happens and the program is still hanging. I guess it is probably because of the parallel processing. Is it possible to turn off the parallel mode and just run on a normal single core?

rhondabacher commented 6 years ago

Hi Zhicheng,

Setting NCores = 1 should avoid creating any parallel processes.

I'm curious whether you tried any of the previous solutions? I usually suggest using the ditherCounts = TRUE option if you have a large number of tied counts. But in other cases I think it must be something different in how server systems are set up, since I haven't been able to replicate it even on the same datasets. Did you try the development version I suggested above, which uses mclapply instead of BiocParallel?
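
For reference, a single-core run combining both suggestions could look roughly like this (a sketch only; counts.num and groupDesign stand in for the count matrix and condition vector from Jens's example above, and the argument names are as used earlier in this thread):

    # Single core (no forked workers) plus dithering to break tied counts
    DataNorm <- SCnorm(counts.num, Conditions = groupDesign,
        FilterCellNum = 10, NCores = 1, ditherCounts = TRUE)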

Any info on your experience is helpful, since I am still trying to resolve this issue.

Thanks! Rhonda

zji90 commented 6 years ago

I am using the development version you suggested. I have tried both the GitHub and Bioconductor versions and they have the same problem (hanging at the start). I have also tried two different computing clusters (both Unix systems) and the same problem happens. There is only one process related to the job, so I guess the previous solution may not work in my case. I think even with NCores = 1 the package still starts a parallel process, which potentially causes the problem. I wonder whether it would take much effort to add an argument to the SCnorm function so that users can turn off the parallel option entirely? For example, just replacing mclapply with lapply/sapply (see the sketch below)?
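
Something like the guard below is what I have in mind (purely illustrative; splitGroups and fitOneGroup are hypothetical stand-ins for whatever SCnorm actually iterates over internally):

    # Hypothetical sketch: skip mclapply entirely when only one core is requested,
    # so no forked worker processes are ever created.
    if (NCores > 1) {
      res <- parallel::mclapply(splitGroups, fitOneGroup, mc.cores = NCores)
    } else {
      res <- lapply(splitGroups, fitOneGroup)
    }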

rhondabacher commented 6 years ago

Thanks for the feedback! Did the ditherCounts = TRUE option change anything? Also, I'm curious: what is the size of your dataset? I'm just trying to identify some commonality between users who run into this issue.

For now, I can create a branch that allows the user to completely turn off the parallel environment. This shouldn't be too difficult; I'll let you know when it's up!

-Rhonda

zji90 commented 6 years ago

Hi Rhonda, adding ditherCounts = TRUE does solve the problem; I have successfully obtained the results. Just wondering how much this setting will affect the final results?

rhondabacher commented 6 years ago

Hi Zhicheng,

Great to hear! In my testing, the normalized counts should all be within 1% of the values obtained with ditherCounts = FALSE, and the vast majority within a 0-0.5% difference. The option is intended to help the fitting of the quantile regression when the data have too many tied counts; it adjusts the counts only by a small value of 0.01.
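
Conceptually, the dithering just adds a tiny random jitter to tied counts before the quantile regression is fit, along these lines (a sketch of the idea only, not the exact code inside SCnorm):

    # Illustration of dithering: tiny noise breaks ties so the quantile
    # regression sees a unique ordering, while the values barely change.
    counts <- c(5, 5, 5, 7, 7, 10)
    dithered <- counts + runif(length(counts), min = -0.01, max = 0.01)
    dithered  # each value shifted by less than 0.01; the ties are broken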

Thanks, Rhonda