kundajelab / phantompeakqualtools

This package computes informative enrichment and quality measures for ChIP-seq/DNase-seq/FAIRE-seq/MNase-seq data. It can also be used to obtain robust estimates of the predominant fragment length or characteristic tag shift values in these assays.
BSD 3-Clause "New" or "Revised" License

another problem: Error: protect(): protection stack overflow #3

Open crazyhottommy opened 7 years ago

crazyhottommy commented 7 years ago

Hi,

I ran into another problem, either with phantompeakqualtools or with R:

Decompressing ChIP file
Decompressing control file
Loading required package: caTools
Reading ChIP tagAlign/BAM file Hs_940_temp//M940_Crep2.tagAlign.gz 
opened Hs_940_temp/M940_Crep2_spp_tmp//M940_Crep2.tagAlign144c7b15c3b9
done. read 42273553 fragments
ChIP data read length 37 
[1] TRUE
Reading Control tagAlign/BAM file Hs_940_.temp/control.tagAlign.gz 
opened Hs_940_.temp/M940_Crep2_spp_tmp//control.tagAlign144c62adab27
done. read 2922304 fragments
Error: protect(): protection stack overflow
Execution halted
Error: protect(): protection stack overflow
Execution halted

I googled and found http://stackoverflow.com/questions/28728774/how-to-set-max-ppsize-in-r

After adding one line

options("experssion" = 500000)

at the top of the run_spp.R script, I get the same error. Thanks for looking into this. Tommy

crazyhottommy commented 7 years ago

any idea on this error?

akundaje commented 7 years ago

Not sure. I haven't come across this before. Have you tried our automated pipeline, https://github.com/kundajelab/chipseq_pipeline ?

Anshul

crazyhottommy commented 7 years ago

I actually wrote a Snakemake version of the pipeline to fit our own usage. I know the nuts and bolts of my own pipeline, so I can easily extend it.

I will continue to debug...thanks though.

Tommy

leepc12 commented 7 years ago

http://stackoverflow.com/questions/28728774/how-to-set-max-ppsize-in-r

Can you try a higher maximum protection stack size with "Rscript --max-ppsize=500000 [RFILE]"? It's 50000 by default, and the maximum accepted value is 500000.

Thanks,

Jin
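
For anyone calling run_spp.R from their own pipeline, the suggestion above can be sketched as a small wrapper. This is only an illustration: the function names and the example file names are hypothetical, and the accepted range for --max-ppsize (10000 to 500000) is an assumption taken from R's Memory help page, so check it against your R version.

```python
import subprocess

# Assumed limits for R's --max-ppsize option (see R's ?Memory help page).
# These are not taken from this thread; verify them for your R version.
PPSIZE_MIN = 10_000
PPSIZE_MAX = 500_000


def build_rscript_command(script, script_args, max_ppsize=PPSIZE_MAX):
    """Return the argv list that runs `script` with a larger protection stack."""
    if not PPSIZE_MIN <= max_ppsize <= PPSIZE_MAX:
        raise ValueError(
            f"max_ppsize must be between {PPSIZE_MIN} and {PPSIZE_MAX}, got {max_ppsize}"
        )
    # R options must come before the script name; script arguments come after it.
    return ["Rscript", f"--max-ppsize={max_ppsize}", script, *script_args]


def run_rscript(script, script_args, max_ppsize=PPSIZE_MAX):
    """Run the script and raise CalledProcessError if R exits with an error."""
    return subprocess.run(build_rscript_command(script, script_args, max_ppsize), check=True)


# Example call (hypothetical file names):
# run_rscript("run_spp.R", ["-c=chip.tagAlign.gz", "-savp", "-out=chip.cc.qc"])
```

Note that the option has to be passed on the Rscript command line; it cannot be changed from inside an R session that is already running.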

crazyhottommy commented 7 years ago

@leepc12 I added options("experssion" = 500000) to the run_spp.R script, but I still get the same error. Puzzling.

andreamariossi commented 7 years ago

Hi, I'm having the same problem. I tried both suggested solutions: Rscript --max-ppsize=500000 [RFILE], and adding options("exppression" = 500000). I tried them on both Xubuntu 16.04 LTS and Red Hat 4.4.7-3. Thanks, Andrea

seenstevo commented 7 years ago

@leepc12 Hi, I recently ran into the same problem while working with DNase-seq data from a new species. I know the program itself still works, because older tagAlign files are processed fine. My working theory is that the error is related to the number of scaffolds/chromosomes (column 1). This new species currently has only a basic genome assembly, so there are many very small scaffolds (something like >200000).

To test this, I derived two files from one file that fails with "Error: protect(): protection stack overflow": one with head -100000 and one with tail -100000. The head file worked fine, while the tail file failed with the same error. The two files are the same size, and the scaffolds are ordered by size (largest first), so the head-derived file contains only one or two distinct scaffolds/chromosomes, while the tail-derived file has a few rows per scaffold and therefore thousands of distinct ones. That is why I suspect the number of scaffolds is the culprit.

@andreakcl89 , @crazyhottommy Does this fit with your issue? Has anyone found a fix? Thanks, Sean
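
A quick way to check whether a given tagAlign file fits this theory is to count the distinct sequence names in column 1. The sketch below is not part of phantompeakqualtools; the function name is hypothetical, and it assumes a whitespace-delimited tagAlign (BED-like) file, optionally gzip-compressed.

```python
import gzip
from collections import Counter


def count_reads_per_sequence(path):
    """Count reads per chromosome/scaffold name (column 1) in a tagAlign file."""
    # Pick the opener from the file extension; both are opened in text mode.
    opener = gzip.open if str(path).endswith(".gz") else open
    counts = Counter()
    with opener(path, "rt") as handle:
        for line in handle:
            fields = line.split(None, 1)  # split once on whitespace to get column 1
            if fields:  # skip blank lines
                counts[fields[0]] += 1
    return counts


# Example (hypothetical file name):
# counts = count_reads_per_sequence("sample.tagAlign.gz")
# print(len(counts), "distinct chromosomes/scaffolds")
# print(counts.most_common(5))
```

Comparing the number of distinct names between a file that works and one that fails would show whether the failures line up with a very large number of scaffolds.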

oshomroni commented 6 years ago

I have the exact same issue using the ATAC-seq workflow provided by the Kundaje lab. I ran it for Xenopus laevis version 9.2, which contains 18 chromosomes and 108015 scaffolds. In terms of proportions, non-mitochondrial chromosomes cover 92% of the genome, a very small percentage is mitochondrial DNA, and 7.7% of the genome is scaffolds. If, as @seenstevo mentioned, the issue is the massive number of scaffolds that the analysis has to run through, maybe it is not too bad to remove 7.7% of the genome to get the analysis done.
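
Dropping the scaffold reads before the cross-correlation step could look roughly like the sketch below. It is only an illustration of the idea, not something the pipeline provides: the function name is hypothetical, the set of names to keep has to come from your own assembly, and the file is assumed to be a whitespace-delimited tagAlign file, optionally gzip-compressed.

```python
import gzip


def _open_text(path, mode):
    """Open a plain or gzip-compressed file in text mode ("r" or "w")."""
    if str(path).endswith(".gz"):
        return gzip.open(path, mode + "t")
    return open(path, mode)


def filter_tagalign(src, dst, keep):
    """Copy reads whose column 1 is in `keep` from src to dst.

    Returns a (kept, dropped) tuple of read counts.
    """
    keep = set(keep)
    kept = dropped = 0
    with _open_text(src, "r") as fin, _open_text(dst, "w") as fout:
        for line in fin:
            fields = line.split(None, 1)
            if not fields:  # ignore blank lines
                continue
            if fields[0] in keep:
                fout.write(line)
                kept += 1
            else:
                dropped += 1
    return kept, dropped


# Example (hypothetical file and chromosome names):
# kept, dropped = filter_tagalign(
#     "sample.tagAlign.gz", "sample.chrom_only.tagAlign.gz", {"chr1L", "chr1S"}
# )
```

The returned counts make it easy to report how many reads were discarded, which matters if the filtered file is used for anything beyond quality metrics.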

akundaje commented 6 years ago

The cross-correlation analysis is not necessary for ATAC-seq data, so you should simply disable it.

Jin, is there a parameter to disable CC analysis for ATAC/DNase?

crazyhottommy commented 6 years ago

@seenstevo No, I am processing the same human data. Somehow, on a different computing cluster, the error disappeared.

oshomroni commented 6 years ago

I added "Rscript --max-ppsize=500000" inside the postalign_xcor.bds script before ${RUN_SPP}, and it worked. I guess you need a good server with tons of memory for this to work.

leepc12 commented 6 years ago

You can disable cross-correlation analysis by activating the pipeline's -no_xcor flag.

Jin