zhpn1024 / ribotish

Ribo-seq TIS Hunter, predicting translation initiation sites and ORFs using riboseq data
http://dx.doi.org/10.1038/s41467-017-01981-8
GNU General Public License v3.0

issues about quality #6

Closed Wang497 closed 2 years ago

Wang497 commented 5 years ago

Hi, when I run ribotish quality, it prints a long error, shown below:

Traceback (most recent call last):
  File "/usr/local/bin/ribotish", line 56, in <module>
    main()
  File "/usr/local/bin/ribotish", line 34, in main
    commands[cmd].run(args)
  File "/usr/local/lib/python2.7/dist-packages/ribotish/run/quality.py", line 85, in run
    cdsBins = args.bins, numProc = args.numProc, verbose = args.verbose, geneformat = args.geneformat)
  File "/usr/local/lib/python2.7/dist-packages/ribotish/zbio/ribo.py", line 980, in lendis
    for result in len_iter:
  File "/usr/local/lib/python2.7/dist-packages/ribotish/zbio/ribo.py", line 943, in _lendis_trans
    if m0 : ism0 = r.is_m0()
  File "/usr/local/lib/python2.7/dist-packages/ribotish/zbio/bam.py", line 415, in is_m0
    if self.read.get_tag('MD')[0] == '0' : return True # mismatch at 0
  File "pysam/libcalignedsegment.pyx", line 2392, in pysam.libcalignedsegment.AlignedSegment.get_tag
  File "pysam/libcalignedsegment.pyx", line 2434, in pysam.libcalignedsegment.AlignedSegment.get_tag
KeyError: "tag 'MD' not present"

Thank you!

Wang497 commented 5 years ago

My Python version is 2.7.12. Should I update to Python 3.7?

zhpn1024 commented 5 years ago

The problem is that your BAM file does not have the 'MD' tag. Try generating a new BAM file that includes the 'MD' tag. You do not need to update to Python 3. For STAR, you can set the --outSAMattributes All option.
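To see what the failing check looks for, here is a minimal sketch that works on plain SAM text lines, such as the output of samtools view. It is not Ribo-TISH's code: the helper names md_tag and first_base_mismatch are hypothetical, and the two example alignments are made up. It only mirrors the test in the traceback, get_tag('MD')[0] == '0', and returns None instead of raising when the tag is missing.

```python
def md_tag(sam_line):
    """Return the MD string from one SAM alignment line, or None if absent."""
    fields = sam_line.rstrip("\n").split("\t")
    # The 11 mandatory SAM columns come first; optional TAG:TYPE:VALUE
    # fields start at column 12.
    for field in fields[11:]:
        if field.startswith("MD:Z:"):
            return field[len("MD:Z:"):]
    return None


def first_base_mismatch(sam_line):
    """Mirror the MD[0] == '0' test from the traceback.

    Returns None (instead of raising KeyError) when the read has no MD tag.
    """
    md = md_tag(sam_line)
    if md is None:
        return None
    return md[0] == "0"


# Two made-up alignments: one with an MD tag, one without.
mandatory = ["r1", "0", "chr1", "100", "255", "28M", "*", "0", "0",
             "A" * 28, "I" * 28]
with_md = "\t".join(mandatory + ["NH:i:1", "MD:Z:0C27"])
without_md = "\t".join(mandatory + ["NH:i:1"])

print(md_tag(with_md), first_base_mismatch(with_md))
print(md_tag(without_md), first_base_mismatch(without_md))
```

If the second case describes your reads, the alignments need to be regenerated with MD tags, for example with the STAR option mentioned above. samtools calmd can also add MD tags to an existing BAM file when given the reference FASTA.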

Wang497 commented 5 years ago

Hi, Zhang: I am processing TI-seq data sets generated using LTM. In my Ribo-TISH results, I noticed that the read count at the start codon is only about 10% of the total read count. What is this value for your data? Are about 80% or 90% of the reads at the start codon? Thank you!

zhpn1024 commented 5 years ago

Have you generated the TI-seq quality result using the ribotish quality -t option? There is a TIS enrich score in the 4th column. Do you mean this value?

Wang497 commented 5 years ago

Thank you for your reply. I used the ribotish quality -t option, and that is the enrich score. But I do not think it is related to the ratio of TIS reads to total reads. What I mean is: in the LTM data you processed, what percentage of the total reads fall on the start codon? For many genes I can still see lots of reads along the CDS region, so I am wondering whether LTM worked inefficiently in my experiment.

zhpn1024 commented 5 years ago

The score is calculated as TIS count / average count; it is not a percentage. The actual TIS percentage is not large because the ORFs are not short. In fact, it is not easy for LTM to reach a high TIS enrich score. Shubin Qian's PNAS LTM data may be the best. Harr is easier, but Harr also has its own problems.
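A small worked example may make the difference between the two numbers concrete. This is only a sketch under assumptions: "average count" is taken to mean the mean read count per position over the ORF, which may not match exactly how ribotish quality computes it, and the function name and read counts are invented for illustration.

```python
def tis_enrich_and_fraction(counts, tis_index=0):
    """Return (TIS count / average count, TIS count / total count).

    counts: read counts per position along one ORF (hypothetical data).
    Assumes 'average count' is the mean count per position.
    """
    total = float(sum(counts))
    if total == 0:
        raise ValueError("no reads on this ORF")
    average = total / len(counts)
    tis = counts[tis_index]
    return tis / average, tis / total


# Made-up profile: 100 reads at the TIS, then 3 reads at each of 300
# downstream positions, so 1000 reads in total.
counts = [100] + [3] * 300
enrich, fraction = tis_enrich_and_fraction(counts)
print("enrich = %.1f, fraction = %.0f%%" % (enrich, fraction * 100))
# prints: enrich = 30.1, fraction = 10%
```

In this toy profile the TIS holds only 10% of the reads, yet its count is about 30 times the per-position average. With the fraction held fixed, the ratio grows with ORF length, which is why a modest TIS percentage on a long ORF can still correspond to clear enrichment.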