Closed nhoffman closed 13 years ago
So will the numbering be the ungapped one?
It's implemented with numbering on the specified ungapped sequence: all columns containing a gap in the reference sequence are removed, then a normal --cut is applied.
But now that I think about it, that's wrong - all insertions will be removed.
Right, what I had in mind was to just identify the columns in the alignment occupied by the positions specified in the reference sequence, translate to coordinates in the alignment, and cut there. No degapping required.
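A minimal sketch of that coordinate translation, in plain Python. This is not seqmagick's actual implementation: the function name, the gap characters, and the assumption that --cut positions are 1-based and inclusive are all illustrative.

```python
def ref_to_alignment_columns(ref_aligned, start, end, gap_chars="-."):
    """Translate a 1-based, inclusive range on the ungapped reference into a
    0-based, half-open column range in the alignment.

    Hypothetical helper for illustration only; not part of seqmagick.
    """
    if start < 1 or end < start:
        raise ValueError("expected 1 <= start <= end")
    col_start = col_end = None
    pos = 0  # number of non-gap reference residues seen so far
    for col, ch in enumerate(ref_aligned):
        if ch in gap_chars:
            continue  # gap in the reference: no reference position consumed
        pos += 1
        if pos == start:
            col_start = col
        if pos == end:
            col_end = col + 1
            break
    if col_start is None or col_end is None:
        raise ValueError("range extends past the end of the reference")
    return col_start, col_end


# Toy alignment; the sequences are made up for the example.
alignment = {
    "HXB2": "AC--GTA-C",
    "seq1": "ACTTGTAGC",
    "seq2": "A-T-GT--C",
}

# Reference positions 2..5 (C, G, T, A) occupy alignment columns 1..6.
lo, hi = ref_to_alignment_columns(alignment["HXB2"], 2, 5)
cut = {name: seq[lo:hi] for name, seq in alignment.items()}
```

Because the slice is taken on alignment columns, seq1 keeps its "TT" insertion relative to HXB2, which the degap-then-cut approach would have dropped.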
Thanks, Noah
On Sep 28, 2011, at 2:43 PM, Connor McCoy <reply@reply.github.com> wrote:
But now that I think about it, that's wrong - all insertions will be removed.
Reply to this email directly or view it on GitHub: https://github.com/fhcrc/seqmagick/issues/22#issuecomment-2230764
Great - that's happening now.
It would be useful to be able to specify the name of a sequence within an alignment to treat as a numbering standard relative to which --cut is applied. For example, to isolate the V3 region from an alignment of HIV genomes (assuming the alignment contains a sequence named "HXB2"):
seqmagick convert --cut 7110:7216 --relative-to HXB2 hiv_genomes.fasta env_v3.fasta