fhcrc / seqmagick

An imagemagick-like frontend to Biopython SeqIO
http://seqmagick.readthedocs.org
GNU General Public License v3.0

Deduplication using seqmagick #69

Closed by dot4822 6 years ago

dot4822 commented 6 years ago

To whom it may concern,

I am trying to use seqmagick to deduplicate a nucleotide alignment in a FASTA file. However, I can only get "--deduplicate-sequences" to work for in-place deduplication. Since I would like to write all the unique aligned sequences to another file, I tried "--deduplicated-sequences-file", but it generates either an empty file or a file containing only sequence IDs.

If possible, could you give me an example of how to use these deduplication options?

Regards, Dot

matsen commented 6 years ago

Have you tried seqmagick convert?

dot4822 commented 6 years ago

Thanks for your quick response.

Yes. I tried seqmagick convert with "--deduplicated-sequences-file", but it did not work. I'll try seqmagick convert with "--deduplicate-sequences" now and see what happens.

Regards, Dot

matsen commented 6 years ago

For your application you just need seqmagick convert --deduplicate-sequences original.fasta deduped.fasta. Let us know if you have difficulty doing that.
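
For anyone who wants to see what deduplicating by sequence content amounts to, here is a minimal pure-Python sketch. It is not seqmagick's implementation and does not use Biopython; the helper names (read_fasta, deduplicate, write_fasta) are hypothetical, and it assumes that two records are duplicates when their sequence strings match exactly (case-sensitive, gaps included) and that the first occurrence is the one to keep.

```python
import io


def read_fasta(handle):
    """Yield (header, sequence) pairs from a FASTA-formatted text handle."""
    header, chunks = None, []
    for line in handle:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new header closes the previous record, if there was one.
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def deduplicate(records):
    """Keep the first record for each distinct sequence, preserving input order.

    Assumption: duplicates are detected by exact string equality of sequences.
    """
    seen = set()
    for header, sequence in records:
        if sequence not in seen:
            seen.add(sequence)
            yield header, sequence


def write_fasta(records, handle):
    """Write (header, sequence) pairs as unwrapped FASTA."""
    for header, sequence in records:
        handle.write(f">{header}\n{sequence}\n")


# Tiny in-memory example: records "a" and "b" carry the same aligned sequence.
example = io.StringIO(">a\nACGT\n>b\nACGT\n>c\nAC-T\n")
out = io.StringIO()
write_fasta(deduplicate(read_fasta(example)), out)
print(out.getvalue())
```

With real files, the same three functions could be applied to handles opened on original.fasta and deduped.fasta, which mirrors the separate input and output files in the command above.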