clwgg / seqfilter

Filter fasta/fastq(.gz) files by ID and/or sequence length
MIT License

#+BEGIN_SRC bash
cd seqfilter
make
#+END_SRC

Negative filtering ('-n') keeps all sequences whose ID does not match any entry in the ID file. Consequently, if no ID file is supplied, no sequence has an ID match and every sequence passes the ID filter.
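The keep/discard rule can be illustrated with a small awk sketch. This is not how seqfilter is implemented; it only mimics the rule under a few assumptions: single-line FASTA records, an ID equal to the first word after '>', and a minimum length that is inclusive. The function name `filter_fa` and the sample files are hypothetical.

```bash
# Illustration of the keep/discard rule only, not seqfilter's own code.
workdir=$(mktemp -d)
printf '>mt\nACGT\n>chr1\nACGTACGT\n>chr2\nAC\n' > "$workdir/in.fa"
printf 'mt\n' > "$workdir/ids.txt"

# filter_fa NEGATE MINLEN IDFILE FASTA   (hypothetical helper)
# NEGATE=0 keeps records whose ID is listed; NEGATE=1 keeps records whose ID is not listed.
# A record must also be at least MINLEN bases long (assumed inclusive).
filter_fa() {
  awk -v negate="$1" -v minlen="$2" '
    FILENAME == ARGV[1] { ids[$1] = 1; next }        # first file: load the ID list
    /^>/ { id = substr($1, 2); header = $0; next }   # header line: remember ID and header
    {
      listed = (id in ids)                           # 1 if the ID is in the list, else 0
      if (listed != (negate + 0) && length($0) >= (minlen + 0))
        print header "\n" $0
    }
  ' "$3" "$4"
}

# positive filter: keep only "mt"
kept_pos=$(filter_fa 0 0 "$workdir/ids.txt" "$workdir/in.fa")

# negative filter plus min length 3: drop "mt", then drop the 2-base "chr2"
kept_neg=$(filter_fa 1 3 "$workdir/ids.txt" "$workdir/in.fa")

# negative filter with an empty ID list: nothing matches, so everything is kept
kept_all=$(filter_fa 1 0 /dev/null "$workdir/in.fa")
```

With an empty ID list, the negative filter reduces to a pure length filter, which is why '-n' is combined with '-m' when filtering by length alone.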

#+BEGIN_SRC bash
# filter by ID file
seqfilter -i in.fq -l ids.txt -o out.fq

# filter by ID file and min length 30
seqfilter -m 30 -i in.fq -l ids.txt -o out.fq

# filter only by min length 30
seqfilter -n -m 30 -i in.fq -o out.fq

# keep only the sequence called "mt"
seqfilter -i in.fa -l <(printf "mt\n") -o mt.fa

# remove the sequence called "mt" from the input
seqfilter -n -i in.fa -l <(printf "mt\n") -o in_no-mt.fa
#+END_SRC