Closed by AlexShein 2 years ago
SARS-CoV-2 genomes? I've seen this before: the sequences are too long. I'm not sure exactly why it segfaults, since the code is designed to fail with an informative error message, but I'm pretty sure it runs out of memory. I don't know of a workaround, sorry.
Hello! Thank you for your response. Correct, those are genomes of various coronaviruses. Approximately what are the memory requirements for a relatively large alignment?
P.S. I managed to get the alignment using a muscle 3.8 version.
Memory scales something like O(N^2 L^2) for N sequences of length L, so this is not a memory-efficient algorithm. IMO it is well worth it for the improved accuracy. Hopefully I can add some heuristics to reduce memory; that is something to think about for later versions.
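To make the scaling concrete, here is a rough sketch (not taken from the MUSCLE source) that compares two inputs using only the N^2 L^2 growth term quoted above. The function names are hypothetical, the result is a unitless ratio rather than a byte count, and the example figures (35 sequences, as the input filename suggests, at roughly 30,000 nt for a coronavirus genome, versus 35 sequences of 1,000 residues) are assumptions for illustration.

```python
def relative_memory(n_seqs: int, seq_len: int) -> int:
    """Unitless N^2 * L^2 growth term (hypothetical helper, not a byte count)."""
    return (n_seqs ** 2) * (seq_len ** 2)


def scaling_ratio(n_a: int, len_a: int, n_b: int, len_b: int) -> float:
    """How many times larger the growth term is for input B than for input A."""
    return relative_memory(n_b, len_b) / relative_memory(n_a, len_a)


if __name__ == "__main__":
    # Assumed inputs: same number of sequences, 30x longer sequences.
    print(scaling_ratio(35, 1_000, 35, 30_000))
    # Doubling only the sequence length quadruples the growth term.
    print(scaling_ratio(35, 1_000, 35, 2_000))
```

Under this estimate, going from 1,000-residue sequences to ~30,000-nt genomes multiplies the growth term by 30^2 = 900 for the same number of sequences, which is consistent with whole-genome inputs running out of memory where shorter sequences do not.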
Understood, thank you for replying! :) P.S. Huge thanks for the tool itself.
Hello!
I have downloaded muscle version 5.1 (linux_intel64) and ran an alignment of this sequences file.
Command:
./muscle5.1.linux_intel64 -align corr_analysis/data/35_sequences.fasta -output corr_analysis/aligned_aligned_sequences.fa -refineiters 2
Output:
The same issue happens with muscle_v5.0.1428_linux.
Is there any known workaround for the issue?