dieterich-lab / JACUSA2

New version of JACUSA -> 2.0
GNU General Public License v3.0

JACUSA2 call-2 Exception #29

Closed: qwang-big closed this issue 3 years ago

qwang-big commented 4 years ago

Hi Michael,

I have the following error when running JACUSA2 on this dataset: /beegfs/scratch/qw/test/1.dRNA_m6A_yeast_knockout

Here is my command line. rt-arrest runs without problems, but call-2 throws the exception below. Any idea how to fix it?

java -jar ../JACUSA_v2.0.0-RC16.jar call-2 -F 1024 -c 10 -p 10 -D -I -a D,Y -P1 RF-FIRSTSTRAND -P2 RF-FIRSTSTRAND -r sc1.out sc_KO.selected.bam sc_WT.selected.bam -R yeast.S288C.genome.fa

java.lang.ArrayIndexOutOfBoundsException: Index 42 out of bounds for length 42
    at lib.phred2prob.Phred2Prob.convert2errorP(Phred2Prob.java:37)
    at lib.phred2prob.Phred2Prob.colSumErrorProb(Phred2Prob.java:95)
    at lib.phred2prob.Phred2Prob.colMeanErrorProb(Phred2Prob.java:109)
    at lib.stat.estimation.provider.pileup.AbstractEstimationContainerProvider.populate(AbstractEstimationContainerProvider.java:113)
    at lib.stat.estimation.provider.pileup.AbstractEstimationContainerProvider.createData(AbstractEstimationContainerProvider.java:75)
    at lib.stat.estimation.provider.pileup.AbstractEstimationContainerProvider.convert(AbstractEstimationContainerProvider.java:54)
    at lib.stat.estimation.provider.pileup.RobustEstimationPileupProvider.convert(RobustEstimationPileupProvider.java:1)
    at lib.stat.dirmult.CallStat.calculate(CallStat.java:42)
    at lib.stat.AbstractStat.filter(AbstractStat.java:16)
    at jacusa.worker.CallWorker.process(CallWorker.java:53)
    at lib.worker.AbstractWorker.doWork(AbstractWorker.java:139)
    at lib.worker.AbstractWorker.processReady(AbstractWorker.java:197)
    at lib.worker.AbstractWorker.run(AbstractWorker.java:213)

Thanks! Qi

piechottam commented 4 years ago

Your sequencing data contains Phred scores > 41:

java.lang.ArrayIndexOutOfBoundsException: Index 42 out of bounds for length 42
    at lib.phred2prob.Phred2Prob.convert2errorP(Phred2Prob.java:37)

What kind of sequencing platform is this?

Best Michael
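The exception is consistent with a precomputed lookup table that holds one entry per Phred score from 0 to 41, so a score of 42 indexes one past the end. The Python sketch below only illustrates that failure mode. It is not JACUSA2's actual code, and `MAX_PHRED`, `ERROR_PROB`, and `phred_to_error_prob` are hypothetical names.

```python
# Illustrative only: mimics a fixed-size Phred -> error-probability table.
# MAX_PHRED = 41 is an assumption inferred from "length 42" in the exception.
MAX_PHRED = 41

# P(error) = 10 ** (-Q / 10) for Q = 0 .. MAX_PHRED (42 entries in total).
ERROR_PROB = [10 ** (-q / 10) for q in range(MAX_PHRED + 1)]


def phred_to_error_prob(q: int) -> float:
    """Look up the error probability for Phred score q.

    Raises IndexError for q > MAX_PHRED, analogous to the Java
    ArrayIndexOutOfBoundsException reported above.
    """
    if q < 0:
        raise ValueError("Phred scores are non-negative")
    return ERROR_PROB[q]
```

With a table like this, any base quality above 41 has no entry, which is why the input scores would need to be capped before the lookup.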

qwang-big commented 4 years ago

Hi Michael, this is data from the Nanopore platform. We want to see whether JACUSA can work on the data generated in this paper (PMID: 32710622). Could I truncate Phred scores > 41 to 41? Best, Qi

CDieterich commented 4 years ago

Hi Qi,

Please provide the NCBI SRR run number(s). Nanopore FASTQ quality values should be around 10-15.

Best

Christoph
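One way to check which quality values a FASTQ file actually contains is to tally the decoded Phred scores. This is a minimal sketch, assuming the standard four-line FASTQ record layout and Phred+33 encoding; `phred_histogram` is a hypothetical helper, not part of JACUSA2.

```python
from collections import Counter
from typing import Iterable


def phred_histogram(lines: Iterable[str], offset: int = 33) -> Counter:
    """Count decoded Phred scores over all quality lines of a FASTQ stream.

    Assumes four lines per record (header, sequence, '+', quality)
    and Phred+33 encoding unless another offset is given.
    """
    counts: Counter = Counter()
    for i, line in enumerate(lines):
        if i % 4 == 3:  # the fourth line of each record holds the qualities
            counts.update(ord(c) - offset for c in line.rstrip("\r\n"))
    return counts
```

Passing an open file handle to `phred_histogram` and taking `max()` of the result gives the highest score in the file, which can then be compared against the expected range.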

qwang-big commented 4 years ago

OK, the data is here: https://www.ncbi.nlm.nih.gov/sra/?term=SRP166020

CDieterich commented 4 years ago

Well, which of the >20 data sets did you use? Please give the SRR numbers.

qwang-big commented 4 years ago

I actually used the bam they processed: https://files.mycloud.com/home.php?brand=webfiles&seuuid=f5887a626fec0b416e77fc9310a7424b&name=1.dRNA_m6A_yeast_knockout.tar

CDieterich commented 4 years ago

Again, answer my question: which SRR numbers?

qwang-big commented 4 years ago

SRX8120042 and SRX8120043

CDieterich commented 4 years ago

Two trivial observations I made:

1) This is mouse data and not yeast! So -R yeast.S288C.genome.fa is wrong. You need to fix that.
2) The FASTQ files on NCBI already contain Phred scores > 40. This is wrong too. You need to clip them.

qwang-big commented 4 years ago

Sorry, that was the wrong SRR number; I only saw the knockout. The SRR should be SRX8120037 and SRX8120036. Again, I didn't process the SRR datasets; the data and reference genome I tested are all in https://files.mycloud.com/home.php?brand=webfiles&seuuid=f5887a626fec0b416e77fc9310a7424b&name=1.dRNA_m6A_yeast_knockout.tar as provided by the author.

CDieterich commented 4 years ago

Well, the author made a mistake by providing inflated Phred scores.

You have to download the FASTQ files and fix the scores yourself, then align and analyse.
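Fixing the scores could look roughly like the following sketch, which caps every base quality in a FASTQ stream at 41. It assumes four-line FASTQ records and Phred+33 encoding; `clip_fastq_qualities` and its parameters are hypothetical names, and the cap of 41 is taken from the exception discussed above.

```python
from typing import Iterable, Iterator


def clip_fastq_qualities(
    lines: Iterable[str], max_phred: int = 41, offset: int = 33
) -> Iterator[str]:
    """Yield FASTQ lines with every quality score capped at max_phred.

    Assumes four lines per record; only the fourth line is modified.
    """
    cap = chr(max_phred + offset)  # highest allowed quality character
    for i, line in enumerate(lines):
        if i % 4 == 3:
            qual = line.rstrip("\n")
            newline = line[len(qual):]  # preserve the original line ending
            # Single characters compare by code point, so min() caps the score.
            yield "".join(min(c, cap) for c in qual) + newline
        else:
            yield line
```

The generator can be fed an open input file and its output written with `writelines()` to a new file, which is then aligned and analysed as described above.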