error when aligning direct RNA data during revcomp script

pre-mRNA commented 3 years ago

I'm trying to use uLTRA v0.3 to align some direct RNA sequencing data to a reference genome, but it throws the error below:

Filtering reads aligned to unindexed regions with minimap2
Done filtering. Reads filtered:68052
batch nt: 75577658 total_nt: 3627727560
77733
Traceback (most recent call last):
  File "/.local/bin/uLTRA", line 645, in <module>
    align_reads(args)
  File "/.local/bin/uLTRA", line 400, in align_reads
    read_batch_temp_file_rc.write('>{0}\n{1}\n'.format(acc, help_functions.reverse_complement(help_functions.remove_read_polyA_ends(seq, args.reduce_read_ployA, 5))))
  File "/.local/lib/python3.7/site-packages/modules/help_functions.py", line 79, in reverse_complement
    rev_comp = ''.join([rev_nuc[nucl] for nucl in reversed(string)])
  File "/.local/lib/python3.7/site-packages/modules/help_functions.py", line 79, in <listcomp>
    rev_comp = ''.join([rev_nuc[nucl] for nucl in reversed(string)])
KeyError: 'U'

The error is resolved by converting U to T in fastq sequences. Just thought I'd flag this because hopefully many people will use your aligner for direct RNA data :)
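For anyone hitting the same error on an older version, the workaround above could look roughly like the sketch below. It is only an illustration, not part of uLTRA: the function name `convert_u_to_t` is made up here, and it assumes a plain four-line-per-record FASTQ (no wrapped sequence lines). Only the sequence line of each record is touched, because `U` is also a valid quality character and can appear in read headers.

```python
import io

# Hypothetical helper, not part of uLTRA.
# Assumes every FASTQ record is exactly four lines (header, sequence, '+', quality).
_U_TO_T = str.maketrans("Uu", "Tt")

def convert_u_to_t(fastq_in, fastq_out):
    """Copy a FASTQ stream, replacing U/u with T/t in sequence lines only."""
    for i, line in enumerate(fastq_in):
        if i % 4 == 1:  # second line of each record is the sequence
            line = line.translate(_U_TO_T)
        fastq_out.write(line)

# Small in-memory demonstration; header and quality lines keep their 'U's.
src = io.StringIO("@read1 U\nACGU\n+\nUUUU\n")
dst = io.StringIO()
convert_u_to_t(src, dst)
print(dst.getvalue(), end="")
```

With real data the same function can be called on two open file handles, one for the original reads and one for the converted output.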

ksahlin commented 3 years ago

Great, thanks for this note! We only used cDNA so didn't consider this. It's an easy fix and I will fix this in the next version, hopefully coming soon.
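The thread does not show how the fix was actually implemented, so the following is only a sketch of one way a reverse-complement helper can tolerate RNA input. The translation table and the choice to map `U` to `A` (so the output is always in the DNA alphabet) are assumptions made for illustration, not uLTRA's code.

```python
# Illustrative sketch only; not the code from uLTRA's help_functions.py.
# U is treated like T, so its complement is A and the result is DNA.
_COMPLEMENT = str.maketrans("ACGTUNacgtun", "TGCAANtgcaan")

def reverse_complement(seq):
    """Return the reverse complement of a DNA or RNA sequence as DNA."""
    return seq.translate(_COMPLEMENT)[::-1]

print(reverse_complement("AAUG"))  # CATT
```

Characters outside the table are passed through unchanged by `str.translate`, so an unexpected symbol no longer raises a `KeyError`; whether silent pass-through is preferable to failing loudly is a design choice.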

ksahlin commented 3 years ago

The new version 0.0.4 fixes this. See https://github.com/ksahlin/ultra/releases/tag/v0.0.4

pre-mRNA commented 2 years ago

Lovely, thank you. I will keep promoting uLTRA to my direct RNA colleagues :)

error when aligning direct RNA data during revcomp script #4