iqbal-lab-org / make_prg

Code to create a PRG from a Multiple Sequence Alignment file

Handling "N" inputs in fasta file #38

Closed: AlperYurtseven closed this issue 1 year ago

AlperYurtseven commented 1 year ago

Hi! I am facing a problem while running make_prg version 0.3.0 with FASTA files that contain "N" (which stands for any nucleotide). Is there an option I can use to solve this issue? Can make_prg only work with the bases A, C, G, and T in FASTA files?

Thank you in advance

mbhall88 commented 1 year ago

We have discussed this many times in the past (sorry, I am struggling to find the previous discussions). The answer is basically no, we can't handle 'N'. I forget whether we can still handle a single N but then fail with more, @leoisl?

The issue is the combinatorial explosion of paths that happens if we expand an N to all four nucleotides. It gets worse the more Ns there are in a row: a run of k Ns expands to 4^k possible paths.
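To make the explosion concrete, here is a minimal Python sketch of what "expanding" Ns would mean. This is not make_prg code. The names `count_expansions` and `expand_ns` are hypothetical and exist only for illustration, and the sketch assumes each N may stand for any of A, C, G, or T.

```python
from itertools import product

NUCLEOTIDES = "ACGT"


def count_expansions(seq: str) -> int:
    """Count the concrete sequences obtained by replacing every N with A, C, G or T."""
    return len(NUCLEOTIDES) ** seq.upper().count("N")


def expand_ns(seq: str):
    """Yield every concrete sequence for seq (only practical for a handful of Ns)."""
    # Each position contributes either its single fixed base or all four bases.
    choices = [NUCLEOTIDES if base == "N" else base for base in seq.upper()]
    for combo in product(*choices):
        yield "".join(combo)


if __name__ == "__main__":
    print(list(expand_ns("ANT")))      # one N: 4 sequences
    print(count_expansions("N" * 10))  # ten Ns: 4**10 sequences
```

One N yields 4 sequences, while a run of ten Ns already yields 4^10 = 1,048,576, which is why expanding Ns quickly becomes impractical.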

AlperYurtseven commented 1 year ago

Thank you for your quick reply