hartleys / QoRTs

Quality of RNA-Seq Toolset

Empty Counts File #69

Open sanchari24 opened 6 years ago

sanchari24 commented 6 years ago

Hello, I am following the alternative splicing analysis pipeline that uses QoRTs and JunctionSeq. I am working with tomato RNA-Seq data, aligned with HISAT2, and I ran QoRTs with "--minMAPQ 60". QoRTs ran without errors, but the two count files it generated are empty. I tried to get counts only for known splice sites. I have attached the summary, log, and GTF file for your reference. Any lead from your side would be very helpful.

ITAG3.2_genomic.gtf.gz QC.summary.txt QC.v1id5eVrCLb8.log

hartleys commented 6 years ago

The QC command requires a GTF file, NOT the flat GFF file.

sanchari24 commented 6 years ago

I gave the genomic GTF file attached above as the input, with the commands:

java -jar QoRTs-STABLE.jar QC \
    --stranded \
    --minMAPQ 60 \
    --runFunctions writeKnownSplices,writeNovelSplices,writeSpliceExon \
    /home/workstation3/rtgr_rna-seq/bam_files/TFRR_1.sorted.bam \
    /home/workstation3/rtgr_rna-seq/input/ITAG3.2_genomic.gtf.gz \
    /home/workstation3/rtgr_rna-seq/bam_files/TFRR_1_count

The file "QC.spliceJunctionAndExonCounts.forJunctionSeq.txt.gz" is empty, and I am getting this error: 'Error message: "IMMPOSSIBLE STATE! FATAL ERROR! qcJunctionCounts.writeOutput, writing forSpliceSeq"'. Thanks for your help.

QC.iM2kH5BvR60T.log

hartleys commented 6 years ago

Hrmm. My suspicion is that there are genes in the GTF with unexpected characters.

In particular, colons in gene names would cause this error to occur.
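
One way to check for this is to list every gene_id in the GTF that contains anything other than plain identifier characters. The following is only a sketch in Python, not part of QoRTs: the function name and the "safe" character set are assumptions made for illustration, and the only character this thread actually identifies as a problem is the colon.

```python
import re

# Assumption: letters, digits, '_', '.', and '-' are treated as "safe".
# Only the colon is named as problematic in this thread; the rest of the
# whitelist is a guess and can be tightened or relaxed.
SAFE_ID = re.compile(r'^[A-Za-z0-9_.\-]+$')
GENE_ID = re.compile(r'gene_id "([^"]*)"')


def suspicious_gene_ids(lines):
    """Return the sorted gene_id values that contain characters outside SAFE_ID."""
    bad = set()
    for line in lines:
        if line.startswith('#'):
            continue  # skip GTF comment lines
        fields = line.rstrip('\n').split('\t')
        if len(fields) < 9:
            continue  # not a complete GTF record
        match = GENE_ID.search(fields[8])
        if match and not SAFE_ID.match(match.group(1)):
            bad.add(match.group(1))
    return sorted(bad)
```

To run it on a compressed annotation, open the file with `gzip.open(path, 'rt')` and pass the file handle to `suspicious_gene_ids`. An empty result means no gene_id fell outside the assumed character set.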

sanchari24 commented 6 years ago

I see, that's not good news for me. I will scan the GTF file properly and let you know. Thanks a lot.

sanchari24 commented 6 years ago

Hello, I went back to the older genome version so that I could use the Ensembl GTF file for my organism. QoRTs ran without any errors. However, when I ran JunctionSeq in R, I got the following error: "Error in DESeqDataSet(se, design = design, ignoreRank) : all samples have 0 counts for all genes. check the counting script."

I then checked the file "QC.spliceJunctionAndExonCounts.forJunctionSeq". It had zero counts for every entry in every sample. I do not understand the reason for this and would be very grateful if you could look into my files. Thank you.

QC.summary.txt QC.spliceJunctionAndExonCounts.forJunctionSeq.txt.gz QC.spliceJunctionCounts.knownSplices.txt.gz QC.spliceJunctionCounts.novelSplices.txt.gz
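
A quick way to confirm the all-zero state for each sample before loading anything into R is to total the counts column of the count file. This is a rough Python sketch, assuming the file is tab-delimited with a feature ID first and an integer count in the last column; that layout is an assumption here, and the function names are made up for illustration.

```python
import gzip


def summarize_counts(lines):
    """Return (n_features, n_nonzero, total) for tab-delimited 'feature<TAB>count' lines."""
    n_features = n_nonzero = total = 0
    for line in lines:
        fields = line.rstrip('\n').split('\t')
        if len(fields) < 2:
            continue  # blank or single-column line
        try:
            count = int(fields[-1])
        except ValueError:
            continue  # header or malformed line
        n_features += 1
        total += count
        if count > 0:
            n_nonzero += 1
    return n_features, n_nonzero, total


def summarize_file(path):
    """Summarize a gzip-compressed count file at the given path."""
    with gzip.open(path, 'rt') as handle:
        return summarize_counts(handle)
```

If `n_nonzero` is 0 for every sample, the problem is upstream of JunctionSeq, in the counting step itself.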

hartleys commented 6 years ago

There are a lot of different things that could have gone wrong.

Can you attach the log?

gk7279 commented 5 years ago

Hi there-

My genomes have underscores and no colons or special characters, but QoRTs still terminates with the same error as the original poster's. My chromosome names all match. My log file is attached. Can you please help?

qc.log
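
For anyone wanting to double-check that the chromosome names really do match, the sequence names in the GTF can be compared with the `@SQ` lines of the BAM header (for example, the text printed by `samtools view -H`). The Python sketch below uses hypothetical helper names and works on plain text lines, so it needs no BAM library.

```python
def gtf_seqnames(lines):
    """Collect the first column (sequence name) of every non-comment GTF line."""
    names = set()
    for line in lines:
        if line.startswith('#') or not line.strip():
            continue
        names.add(line.split('\t', 1)[0])
    return names


def sam_header_seqnames(lines):
    """Collect the SN: values from @SQ lines of a SAM/BAM text header."""
    names = set()
    for line in lines:
        if not line.startswith('@SQ'):
            continue
        for field in line.rstrip('\n').split('\t')[1:]:
            if field.startswith('SN:'):
                names.add(field[3:])
    return names


def compare_seqnames(gtf_lines, header_lines):
    """Return (names only in the GTF, names only in the BAM header), both sorted."""
    gtf = gtf_seqnames(gtf_lines)
    bam = sam_header_seqnames(header_lines)
    return sorted(gtf - bam), sorted(bam - gtf)
```

Two empty lists mean every sequence name appears on both sides. Names that appear only in the GTF are annotation that no read can ever be assigned to.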

hartleys commented 5 years ago

Hmm. Very strange. Could I see the GTF file?

Also, try running this:

java [Java Options] -jar QoRTs.jar makeFlatGff --stranded ip.gtf ip.flat.GFF

And maybe post an excerpt from the GFF file?