dentearl / mafTools

Bioinformatics tools for dealing with Multiple Alignment Format (MAF) files.

mafExtractor splits MSA blocks even when they should stay together #17

Closed zy041225 closed 6 years ago

zy041225 commented 6 years ago

Hi,

I'm dealing with my MSA using mafExtractor, but some output is split into two or more alignments when they should be one.

For example, when I run the following command:

mafExtractor --maf chrX.nonoverlap.maf --seq "hsap.chrX" --start 100067347 --stop 100067714 > split/chrX.nonoverlap.maf.100067347-100067714.maf

one of the MSA blocks (origin.txt) is split into multiple parts (split.txt).

If I add --soft, it outputs the same alignment as the input, without any extraction.

Please check.

dentearl commented 6 years ago

I'm pretty sure it's doing the right thing here. The sequence you asked for, hsap.chrX, has gaps in it. mafExtractor only cares about that sequence, so any insertions relative to it (columns where hsap.chrX is a gap) are left out, and the block gets split at those points.
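
To make the splitting concrete, here is a minimal Python sketch of the idea. It is not mafExtractor's actual code, just an illustration under simple assumptions: the block is a dict of equal-length aligned rows, and the function name and the ptro/mmus rows are made up for the example.

```python
def split_on_reference_gaps(rows, ref):
    """Split aligned rows into sub-blocks wherever the reference row has a gap.

    rows: dict mapping sequence name -> aligned text (all the same length).
    ref:  name of the reference row. Hypothetical helper, for illustration only.
    """
    ref_text = rows[ref]
    pieces = []
    start = None  # column where the current run of reference bases began
    # A trailing sentinel gap flushes the last run of reference bases.
    for col, base in enumerate(ref_text + "-"):
        if base != "-":
            if start is None:
                start = col
        elif start is not None:
            pieces.append({name: text[start:col] for name, text in rows.items()})
            start = None
    return pieces


# Made-up block: the reference has a two-column gap in the middle.
block = {
    "hsap.chrX": "ACGT--TTGA",
    "ptro.chrX": "ACGTCCTTGA",
    "mmus.chrX": "AC-TC-TTGA",
}

for piece in split_on_reference_gaps(block, "hsap.chrX"):
    print(piece)
```

The two columns where hsap.chrX is a gap are dropped, so one input block comes out as two pieces. Gaps in the other rows (the one in mmus.chrX here) do not cause a split.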

zy041225 commented 6 years ago

Thanks for the explanation. By the way, does mafExtractor check for gaps only in the reference sequence?

dentearl commented 6 years ago

Right, it walks the reference and pulls out all blocks that contain the reference and cuts blocks that lack the reference.
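
As a rough sketch of the block selection step (again, not the real implementation), the snippet below keeps only blocks that have a row for the reference overlapping the requested range. Assumptions for the example: the standard MAF `s` line layout (name, start, size, strand, srcSize, text), zero-based inclusive start and stop, positive-strand reference rows only, and blocks separated by blank lines. The function name and the sample MAF text are made up.

```python
def blocks_with_reference(maf_text, ref, start, stop):
    """Return MAF blocks whose `ref` row overlaps [start, stop].

    Coordinates are treated as zero-based and inclusive (an assumption),
    and only positive-strand reference rows are considered.
    """
    selected = []
    for block in maf_text.strip().split("\n\n"):
        for line in block.splitlines():
            fields = line.split()
            # MAF sequence line: s name start size strand srcSize text
            if len(fields) == 7 and fields[0] == "s" and fields[1] == ref:
                row_start, size, strand = int(fields[2]), int(fields[3]), fields[4]
                row_end = row_start + size - 1
                if strand == "+" and size > 0 and row_start <= stop and row_end >= start:
                    selected.append(block)
                break  # only the first reference row per block is checked
    return selected


# Made-up MAF text with three blocks.
SAMPLE = """##maf version=1

a score=0
s hsap.chrX 100 4 + 1000 ACGT
s ptro.chrX 200 4 + 2000 ACGT

a score=0
s ptro.chrX 300 4 + 2000 TTGA
s mmus.chrX 50 4 + 900 TTGA

a score=0
s hsap.chrX 500 4 + 1000 GGCC
s mmus.chrX 60 4 + 900 GGCC
"""

kept = blocks_with_reference(SAMPLE, "hsap.chrX", 102, 300)
print(len(kept))
```

Only the first block is kept: the second has no hsap.chrX row at all, and the third has one but it lies outside the requested range.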
