wilkelab / Opfi

A Python package for discovery, annotation, and analysis of gene clusters in genomics or metagenomics data sets.
https://opfi.readthedocs.io/
MIT License

Resolves #71: Use ORF amino acid sequences instead of alignments in output #95

Closed: alexismhill3 closed this 4 years ago

alexismhill3 commented 4 years ago

The pipeline was saving the alignment character string for each putative hit, since BLAST returns it anyway. However, it would probably be better to save the full query sequence: the alignment sequence that BLAST outputs is often shorter than the query sequence and can contain gap characters.
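To make the difference concrete, here is a small sketch with toy data. Both sequences are made up for illustration, and it assumes the saved alignment string is the aligned portion of the query; nothing here is taken from Opfi's code.

```python
# Hypothetical ORF translation used as the BLAST query (toy data).
orf_protein = "MKTAYIAKQRQISFVKSHFSRQLEERLGLIEVQ"

# Hypothetical aligned portion of the query as BLAST might report it:
# it covers only part of the query and contains one gap character.
aligned_query = "AYIAKQ-RQISFVKSHF"

# Removing the gap recovers a fragment of the query, not the whole ORF.
ungapped = aligned_query.replace("-", "")

print("full ORF length:    ", len(orf_protein))
print("aligned residues:   ", len(ungapped))
print("fragment of the ORF:", ungapped in orf_protein)
```

In this toy case the alignment string alone cannot be used to recover the residues outside the aligned region, which is the motivation for storing the ORF sequence instead.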

Rather than adding another column to the pipeline output CSV, this PR just replaces the alignment sequence with the ORF amino acid sequence in the output.
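A rough sketch of what this replacement amounts to, using only the standard library. The column names (`orf_id`, `hit_name`, `sequence`), the helper `replace_alignment_with_orf`, and the example rows are hypothetical and do not reflect Opfi's actual output schema or internals.

```python
import csv
import io


def replace_alignment_with_orf(hits, orf_proteins):
    """Return copies of the hit rows whose 'sequence' field holds the
    full ORF translation instead of the BLAST alignment string.

    `hits` is a list of dicts and `orf_proteins` maps ORF ids to amino
    acid sequences; both layouts are assumptions made for this sketch.
    """
    updated = []
    for hit in hits:
        row = dict(hit)  # copy, so the input rows are left untouched
        row["sequence"] = orf_proteins[row["orf_id"]]
        updated.append(row)
    return updated


# Toy inputs: one ORF and one hit that still carries the gapped alignment.
orf_proteins = {"orf_1": "MKTAYIAKQRQISFVKSHFSRQLEERLGLIEVQ"}
hits = [{"orf_id": "orf_1", "hit_name": "cas9", "sequence": "AYIAKQ-RQISFVKSHF"}]

rows = replace_alignment_with_orf(hits, orf_proteins)

# Write the rows with the same columns as before: no new column is added.
buffer = io.StringIO()
writer = csv.DictWriter(buffer, fieldnames=["orf_id", "hit_name", "sequence"])
writer.writeheader()
writer.writerows(rows)
csv_text = buffer.getvalue()
print(csv_text)
```

Reusing the existing column keeps the output the same width, at the cost of changing what that column means for anyone reading older results.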