crosenth closed this issue 6 years ago
deenurp orientate_sequences --help
usage: deenurp orientate_sequences [-h] [--threads NUM] [--id ID] [--out fasta]
                                   [--out_csv csv] [--out_notmatched fasta]
                                   fasta fasta

Fix orientation of sequences and output target sequence alignment indexes

positional arguments:
  fasta                 input sequences
  fasta                 target sequences

optional arguments:
  -h, --help            show this help message and exit
  --threads NUM         number of available threads [all]
  --id ID               alignment identity percent

outputs:
  --out fasta           [stdout]
  --out_csv csv         output csv with columns query,target,tilo,tihi
  --out_notmatched fasta
                        seqnames that did not match tseqs at id threshold
Needs some test cases.
Allow seq_info input (notmatched_seq_info.csv and matched_seq_info.csv)
https://github.com/fhcrc/deenurp/blob/master/deenurp/subcommands/orientate_sequences.py
Still needs some unit tests.
TODO: Filter out low-coverage alignments.
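For the low-coverage TODO, here is a rough sketch of what such a filter might look like. This is not the code in orientate_sequences.py: the hit fields `qlo`, `qhi` and `ql` (query alignment start, end, and query length) and the 0.9 default cutoff are assumptions made for illustration, named by analogy with the `tilo`/`tihi` columns above.

```python
def query_coverage(qlo, qhi, qlen):
    """Fraction of the query spanned by the alignment.

    Coordinates are assumed to be 1-based and inclusive.
    """
    return (qhi - qlo + 1) / qlen


def filter_low_coverage(hits, min_coverage=0.9):
    """Keep only hits whose alignment spans at least min_coverage of the query.

    `hits` is a list of dicts with the hypothetical keys qlo, qhi and ql.
    """
    return [
        h for h in hits
        if query_coverage(h["qlo"], h["qhi"], h["ql"]) >= min_coverage
    ]


hits = [
    {"query": "q1", "target": "t1", "qlo": 1, "qhi": 95, "ql": 100},
    {"query": "q2", "target": "t1", "qlo": 1, "qhi": 50, "ql": 100},
]
kept = filter_low_coverage(hits)  # only q1 passes the 0.9 cutoff
```

Sequences dropped this way would presumably go to the --out_notmatched output along with those below the identity threshold, but that is a design choice still to be made.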
Create a script that reverse complements sequences that are in the wrong orientation. The output will be a fasta file of the sequences that matched the target sequences at a given percent identity, plus an optional csv with the alignment indexes on the target sequence(s). An optional output of the sequences that did not match will be available as well.
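The core of the idea can be sketched as follows. This is only an illustration and assumes the alignment step has already produced, for each matched query, the target name, the strand, and the target indexes; the `orient` function and the shape of `hits` are hypothetical, not the subcommand's actual interface.

```python
# Complement table including IUPAC ambiguity codes; S and W are their own
# complements, so they are left untranslated.
_COMPLEMENT = str.maketrans(
    "ACGTURYKMBDHVNacgturykmbdhvn",
    "TGCAAYRMKVHDBNtgcaayrmkvhdbn",
)


def reverse_complement(seq):
    """Return the reverse complement of a nucleotide sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def orient(seqs, hits):
    """Split sequences into matched (correctly oriented) and not matched.

    seqs: dict of seqname -> sequence
    hits: dict of seqname -> (target, strand, tilo, tihi), strand is "+" or "-"
          (hypothetical structure standing in for the alignment results)
    Returns (matched, rows, notmatched) where rows follow the csv columns
    query,target,tilo,tihi.
    """
    matched, rows, notmatched = [], [], []
    for name, seq in seqs.items():
        if name not in hits:
            notmatched.append((name, seq))
            continue
        target, strand, tilo, tihi = hits[name]
        if strand == "-":
            # Query aligned to the reverse strand of the target: flip it
            seq = reverse_complement(seq)
        matched.append((name, seq))
        rows.append((name, target, tilo, tihi))
    return matched, rows, notmatched


seqs = {"q1": "AACG", "q2": "CGTT", "q3": "GGGG"}
hits = {"q1": ("t1", "+", 1, 4), "q2": ("t1", "-", 1, 4)}
matched, rows, notmatched = orient(seqs, hits)
```

In this toy input q2 is the reverse complement of q1, so both end up as AACG in `matched`, while q3 has no hit and lands in `notmatched`.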