deweylab / RSEM

RSEM: accurate quantification of gene and isoform expression from RNA-Seq data
http://deweylab.biostat.wisc.edu/rsem/
GNU General Public License v3.0
408 stars 118 forks source link

pRSEM ValueError: max() arg is an empty sequence #54

Closed crazyhottommy closed 7 years ago

crazyhottommy commented 7 years ago

Hi, I was testing pRSEM, and got error below. do you know why is that?

I only provided one ChIP-seq target fastq, so the IDR is complaining?

Thanks! Tommy

Traceback (most recent call last):
  File "/scratch/genomic_med/apps/RSEM/pRSEM/prsem-calculate-expression", line 74, in <module>
    main()
  File "/scratch/genomic_med/apps/RSEM/pRSEM/prsem-calculate-expression", line 30, in main
    Prsem.genChIPSeqPeakFileBySPPIDR(param)
  File "/scratch/genomic_med/apps/RSEM/pRSEM/Prsem.py", line 53, in genChIPSeqPeakFileBySPPIDR
    cse_target.getPeaksByIDR(cse_control.pooled_tagalign)
  File "/scratch/genomic_med/apps/RSEM/pRSEM/ChIPSeqExperiment.py", line 195, in getPeaksByIDR
    max_npeaks = max(fidr2npeaks.values())
ValueError: max() arg is an empty sequence
crazyhottommy commented 7 years ago

I fooled pRSEM by using the same ChIP-seq target fastq twice (same name), the above error disappeared, but some new error occurred. what's the name convention for the fastq? H3K27ac-rep1.fastq, H3K27ac-rep2.fastq? seems pRSEM is parsing the fastq names.

crazyhottommy commented 7 years ago

update: pRSEM errored at the phantompeakqual step.

# reads with alignments suppressed due to -m: 7411281 (14.13%)
Reported 42273553 alignments to 1 output stream(s)
# reads processed: 52467512
# reads with at least one reported alignment: 42273553 (80.57%)
# reads that failed to align: 2782678 (5.30%)
# reads with alignments suppressed due to -m: 7411281 (14.13%)
Reported 42273553 alignments to 1 output stream(s)
# reads processed: 121283189
# reads with at least one reported alignment: 79412215 (65.48%)
# reads that failed to align: 16999116 (14.02%)
# reads with alignments suppressed due to -m: 24871858 (20.51%)
Reported 79412215 alignments to 1 output stream(s)
Loading required package: caTools
Loading required package: caTools
Loading required package: caTools
Error: protect(): protection stack overflow
Execution halted
Error: protect(): protection stack overflow
Execution halted

Failed with return code 1

Error: protect(): protection stack overflow
Execution halted
Error: protect(): protection stack overflow
Execution halted

Failed with return code 1

Error: protect(): protection stack overflow
Execution halted
Error: protect(): protection stack overflow
Execution halted

Failed with return code 1

File not found: Hs_940_pRSEM.temp/M940_Crep1.tagAlign_VS_control.tagAlign.regionPeak.gz
crazyhottommy commented 7 years ago

any help here for Error: protect(): protection stack overflow? I googled and found http://stackoverflow.com/questions/28728774/how-to-set-max-ppsize-in-r how to solve it in pRSEM?

THanks!

pliu55 commented 7 years ago

Hi @crazyhottommy

Thanks for your interest in using pRSEM. Could you please send me a mocked version of the ChIP-seq FASTQ files your are using so that I can replicate the error on my end? Thanks.

Best, Peng

crazyhottommy commented 7 years ago

Hi, please find the download links, thanks! IP: https://drive.google.com/open?id=0B6NrxMpU91qELTUwNjFKVkJ4SlU Input: https://drive.google.com/open?id=0B6NrxMpU91qEbEx6MGx4aGgyVG8

I do not have replicate for IP, you can just pass it twice to pRSEM, but you may need to make a different name.

Thanks! Tommy

crazyhottommy commented 7 years ago

@pliu55 please let me know if you got a chance to test. Thank you for your help.

Tommy

pliu55 commented 7 years ago

@crazyhottommy

I could not replicate the error at my end. Below is the command I used:

rsem-calculate-expression \
  --estimate-rspd \
  --seed 12345 \
  --num-threads 16 \
  --paired-end \
  --forward-prob 0 \
  --keep-intermediate-files \
  --calc-pme \
  --gibbs-number-of-samples 100 \
  --no-bam-output \
  --star \
  --star-path $starpath \
  --star-gzipped-read-file \
  --run-pRSEM \
  --partition-model pk \
  --chipseq-target-read-files M940_Crep1.fq.gz,M940_Crep2.fq.gz \
  --chipseq-control-read-files M-940-R1-07-G-NC.sorted.fq.gz \
  --bowtie-path $bowtiepath \
  mmliver_1.fq.gz mmliver_2.fq.gz $refdir/$runid $runid

where

Could you please try the above command at your end to see if you still receive the error? And please let me know if more details are needed. Thanks.

crazyhottommy commented 7 years ago

I switched to LSF system and this error disappeared. but I had a new error, opening a new issue.

Thanks! Tommy