Prune_transcripts.py error

Hi Alex,

Thanks for making the ORF-RATE pipeline public and accessible.

When running prune_transcripts.py with the following parameters in Python 2.7.15: prune_transcripts.py --inbed annot.bed --minlen 27 --maxlen 32 genome.fa test.bam -p 10 --verbose

I am getting the following error :

[2019-09-30 11:59:51] Reading transcriptome and genome [2019-09-30 12:00:05] Parsing sequence and count information

Traceback (most recent call last): File "prune_transcripts.py", line 179, in % (len(tid_summary), lowreads_dropped, len(tid_summary)-lowreads_dropped, opts.minreads, opts.peakfrac))

ValueError: All 46526 transcripts dropped due to too few reads (46526) or too many reads coming from one position (0). Consider decreasing MINREADS (currently 64) or increasing PEAKFR(currently 0.200000), or check validity of input BAM file.

I have done as suggested in the ValueError and decreased MINREADS and increased PEAKFR, but I keep getting the same error no matter what parameters I use. I've checked the input BAM file and it seems okay too. I am quite sure that the problem isn't low coverage. When doing samtools flagstat of the BAM file I get:

203414522 + 0 in total (QC-passed reads + QC-failed reads) 100572976 + 0 secondary 0 + 0 supplementary 0 + 0 duplicates 203414522 + 0 mapped (100.00% : N/A) 0 + 0 paired in sequencing 0 + 0 read1 0 + 0 read2 0 + 0 properly paired (N/A : N/A) 0 + 0 with itself and mate mapped 0 + 0 singletons (N/A : N/A) 0 + 0 with mate mapped to a different chr 0 + 0 with mate mapped to a different chr (mapQ>=5)

Please find attached a sample of the structure of my input files: genome.fa.txt annot.bed.txt

Thank you so much,

Marina

alexfields / ORF-RATER

Prune_transcripts.py error #12