zhpn1024 / ribotish

Ribo-seq TIS Hunter, predicting translation initiation sites and ORFs using riboseq data
http://dx.doi.org/10.1038/s41467-017-01981-8
GNU General Public License v3.0

No reads found! #29

Closed noepozzan closed 2 years ago

noepozzan commented 2 years ago

Hi there,

I am trying to run ribotish on regular Ribo-seq data from our lab. Unfortunately, it already fails at the quality step. I mapped the reads to the reference genome (GRCm39) using a special annotation file from RNAcentral that contains annotations for ncRNAs, and I passed the same GTF to ribotish in the command below.

I ran:

ribotish quality -b file.bam -g mus_musculus.RNAcentral.gtf --th 0.5

The error I get is:

Counted reads: 0
Error: no reads found! Check read length or protein coding annotation.

The first few GTF entries look like this:

1   RNAcentral  transcript  3056358 3056384 .   -   .   transcript_id "URS0000253E70_10090.0"; gene_id "URS0000253E70_10090.0";
1   RNAcentral  exon    3056358 3056384 .   -   .   transcript_id "URS0000253E70_10090.0";
1   RNAcentral  transcript  3056360 3056384 .   -   .   transcript_id "URS000042B783_10090.1"; gene_id "URS000042B783_10090.1";

Do you have any idea where this error may be coming from?

Thanks for your help!
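The error message above points at the protein-coding annotation, and the GTF lines shown contain only transcript and exon features. A minimal sketch for checking which feature types a GTF contains, and whether any CDS records are present, could look like the following. The function name `count_feature_types` is hypothetical, and whether a missing CDS is the actual cause of the error here is an assumption, not something confirmed in this thread.

```python
from collections import Counter


def count_feature_types(gtf_lines):
    """Count the feature types (column 3) found in an iterable of GTF lines."""
    counts = Counter()
    for line in gtf_lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and header comments
        fields = line.split("\t")
        if len(fields) < 9:
            # Pasted text often has tabs converted to spaces; fall back to
            # whitespace splitting and keep the attribute column intact.
            fields = line.split(None, 8)
        if len(fields) >= 3:
            counts[fields[2]] += 1
    return counts


# The three entries quoted in the issue, as pasted (space-separated).
sample = [
    '1   RNAcentral  transcript  3056358 3056384 .   -   .   '
    'transcript_id "URS0000253E70_10090.0"; gene_id "URS0000253E70_10090.0";',
    '1   RNAcentral  exon    3056358 3056384 .   -   .   '
    'transcript_id "URS0000253E70_10090.0";',
    '1   RNAcentral  transcript  3056360 3056384 .   -   .   '
    'transcript_id "URS000042B783_10090.1"; gene_id "URS000042B783_10090.1";',
]

counts = count_feature_types(sample)
print(dict(counts))
print("has CDS records:", counts["CDS"] > 0)
```

For a full file, the same function can be called on an open file handle, e.g. `count_feature_types(open("mus_musculus.RNAcentral.gtf"))`.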

zhpn1024 commented 1 year ago

Haha. The warnings arise because the thickStart/thickEnd fields (columns 7 and 8 of the BED format) represent the coding region, whereas RNAcentral sets them to the same values as the transcript start and end, as if the whole transcript were coding. In fact, these transcripts are all non-coding. This does not affect the results, except for the TISType classification.
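The situation described above can be detected with a small check on the BED columns. The sketch below is illustrative only: the function name `thick_spans_whole_transcript` and the two example BED lines are hypothetical and are not taken from the RNAcentral files. It flags records whose thick region (columns 7 and 8) is identical to the transcript range (columns 2 and 3).

```python
def thick_spans_whole_transcript(bed_line):
    """Return True if thickStart/thickEnd equal chromStart/chromEnd.

    Such a record reads as if the whole transcript were coding.
    """
    fields = bed_line.rstrip("\n").split("\t")
    if len(fields) < 8:
        return False  # no thickStart/thickEnd columns present
    chrom_start, chrom_end = int(fields[1]), int(fields[2])
    thick_start, thick_end = int(fields[6]), int(fields[7])
    return thick_start == chrom_start and thick_end == chrom_end


# Hypothetical records for illustration (made-up names and coordinates).
whole = "1\t3056357\t3056384\tURS_example_a\t0\t-\t3056357\t3056384"
empty = "1\t3056357\t3056384\tURS_example_b\t0\t-\t3056357\t3056357"

print(thick_spans_whole_transcript(whole))
print(thick_spans_whole_transcript(empty))
```

The second record follows the common BED convention for features with no thick part, where thickStart and thickEnd are both set to chromStart.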

noepozzan commented 1 year ago

hmm, I see, haha.. Thanks for the help!