yangao07 / abPOA

abPOA: an SIMD-based C library for fast partial order alignment using adaptive band
MIT License
MSA to GFA #17

Closed dbrami closed 3 years ago

dbrami commented 3 years ago

Hello, thank you for this wonderful, cross-platform tool! Is there a way to use it to go from an existing MSA to GFA? This would be my "starter" graph; I would then use your upcoming "incremental" option to add either new MSAs or new sequences to it.

yangao07 commented 3 years ago

Are you referring to the MSA format output of abPOA or other styles of MSA?

dbrami commented 3 years ago

Hi, I’m thinking of a FASTA-style MSA.

yangao07 commented 3 years ago

Can you describe it in more detail, or give an example?

dbrami commented 3 years ago

Of course. Here's an example of a coronavirus alignment from GISAID: first_20_msa_0201.fasta.gz

dbrami commented 3 years ago

Something that does the same thing as https://github.com/hanzou666/msa2gfa. However, that tool is a pure Python implementation with a bunch of for loops, so it is very slow.
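For readers unfamiliar with what such a conversion involves, here is a minimal sketch in Python. It is not the algorithm used by msa2gfa, vg, or abPOA; it simply assumes that every distinct (column, base) pair in the alignment becomes a one-base node, and that each sequence becomes a path through its non-gap columns. The function names `parse_fasta` and `msa_to_gfa` are hypothetical, and the output is an uncompacted GFA 1.0 graph.

```python
def parse_fasta(text):
    """Parse FASTA text into a {name: aligned_sequence} dict (input order kept)."""
    records = {}
    name = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # Use the first word of the header as the sequence name.
            name = line[1:].split()[0]
            records[name] = []
        elif name is not None:
            records[name].append(line.upper())
    return {n: "".join(parts) for n, parts in records.items()}


def msa_to_gfa(msa, gap="-"):
    """Turn an aligned {name: sequence} dict into GFA 1.0 text.

    Assumption of this sketch: one node per distinct (column, base) pair,
    so nodes are single bases and the graph is not compacted.
    """
    if len({len(seq) for seq in msa.values()}) != 1:
        raise ValueError("all aligned sequences must have the same length")

    node_ids = {}   # (column, base) -> integer node id
    edges = set()   # (source node id, target node id)
    paths = {}      # sequence name -> list of node ids

    for name, seq in msa.items():
        walk = []
        for col, base in enumerate(seq):
            if base == gap:
                continue  # gaps contribute no node to this sequence's path
            node = node_ids.setdefault((col, base), len(node_ids) + 1)
            if walk:
                edges.add((walk[-1], node))
            walk.append(node)
        paths[name] = walk

    lines = ["H\tVN:Z:1.0"]
    for (_, base), node in sorted(node_ids.items(), key=lambda kv: kv[1]):
        lines.append(f"S\t{node}\t{base}")
    for src, dst in sorted(edges):
        lines.append(f"L\t{src}\t+\t{dst}\t+\t0M")
    for name, walk in paths.items():
        if walk:
            steps = ",".join(f"{n}+" for n in walk)
            lines.append(f"P\t{name}\t{steps}\t*")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    example = ">a\nAC-T\n>b\nAGGT\n"
    print(msa_to_gfa(parse_fasta(example)), end="")
```

A real converter would also merge unbranched runs of nodes into longer segments; that step is left out here to keep the sketch short.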

subwaystation commented 3 years ago

If you just want to go from MSA to GFA, then you could try out vg construct.

construct from a multiple sequence alignment:
    -M, --msa FILE         input multiple sequence alignment
    -F, --msa-format       format of the MSA file (options: fasta, clustal; default fasta)
    -d, --drop-msa-paths   don't add paths for the MSA sequences into the graph

yangao07 commented 3 years ago

@dbrami I do not plan to add an MSA-to-GFA conversion feature to abPOA. However, the latest abPOA (v1.1.0) can incrementally align sequences to an existing GFA or MSA. In your case, you may be able to align your new sequences to the MSA directly with abPOA.

Yan

dbrami commented 3 years ago

Dear @subwaystation and @yangao07, although your answers are seemingly brief and simple, they have been a tremendous help. Thank you!