rcedgar / muscle

Multiple sequence and structure alignment with top benchmark scores scalable to thousands of sequences. Generates replicate alignments, enabling assessment of downstream analyses such as trees and predicted structures.
https://drive5.com/muscle
GNU General Public License v3.0

Segfault with long sequences #31

Closed rcedgar closed 5 days ago

rcedgar commented 2 years ago

To align a pair of sequences of length L, muscle requires at least 5 x L^2 bytes of memory.

The upper limit on L is currently a bit less than 30,000 letters. Unfortunately, this is close to the length of a SARS-CoV-2 genome, and attempts to align SARS-CoV-2 genomes with muscle usually fail for this reason.
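
To make the 5 x L^2 estimate concrete, here is a minimal sketch that evaluates the lower bound for a few lengths. The function name is hypothetical and the code only reproduces the arithmetic stated above; it says nothing about how muscle allocates memory internally. At L = 30,000 the bound works out to 4.5 GB.

```python
def min_pairwise_memory_bytes(length: int) -> int:
    """Lower bound from the comment above: 5 * L^2 bytes for a pair of length-L sequences."""
    return 5 * length * length


# Print the bound in decimal gigabytes (10^9 bytes) for a few example lengths.
for length in (1_000, 10_000, 30_000):
    gigabytes = min_pairwise_memory_bytes(length) / 1e9
    print(f"L = {length:,}: at least {gigabytes:.3f} GB")
```

Because the bound grows with the square of L, doubling the sequence length quadruples the memory required.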

Currently, muscle segfaults if this limit is exceeded. This is a bug; it should give a graceful error message and exit.
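
Until that is fixed, one possible workaround is to check sequence lengths before running muscle. The sketch below is not part of muscle: the function names are hypothetical, and the 30,000 threshold is an assumption, since the exact limit is only described as "a bit less than 30,000".

```python
from typing import Dict, Iterable, List

# Assumption: placeholder threshold; the exact limit is not stated in this thread.
ASSUMED_MAX_LENGTH = 30_000


def fasta_lengths(lines: Iterable[str]) -> Dict[str, int]:
    """Return the sequence length for each FASTA record, keyed by header text."""
    lengths: Dict[str, int] = {}
    label = None
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            label = line[1:].strip()
            lengths[label] = 0
        elif label is not None:
            lengths[label] += len(line)
    return lengths


def too_long(lengths: Dict[str, int], max_length: int = ASSUMED_MAX_LENGTH) -> List[str]:
    """List the records whose length is at or above the assumed limit."""
    return [label for label, n in lengths.items() if n >= max_length]


def check_file(path: str) -> List[str]:
    """Read a FASTA file and return the labels of sequences that look too long."""
    with open(path) as handle:
        return too_long(fasta_lengths(handle))
```

For example, calling `check_file` on the input file and stopping with a clear message when it returns a non-empty list would avoid the crash, at the cost of rejecting some inputs that might still have fit under the real limit.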

fzyxh commented 1 year ago

So is there any solution for aligning sequences longer than 30,000 letters? -super5 doesn't work either.

rcedgar commented 1 year ago

Not at the moment, sorry.