TimoLassmann / kalign

A fast multiple sequence alignment program.
GNU General Public License v3.0

Segmentation fault for large datasets #30

Closed Shubhangi1397 closed 2 years ago

Shubhangi1397 commented 2 years ago

Hey, I ran multiple sequence alignments on protein sequences of length 7096 for large datasets (thousands to hundreds of thousands of sequences) with clustalO and kalign. I had cleaned the datasets, i.e. removed sequences containing the characters BJOUXZ as well as duplicate sequences. Running the MSA on 15,000 sequences and above gave me a segmentation fault while building the guide tree in kalign, whereas clustalO ran just fine.

[2022-03-10 11:55:18] : LOG : Building guide tree.
Segmentation fault
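For readers who want to reproduce the cleaning step described above, here is a minimal sketch in Python. It assumes the input is FASTA text and that "duplicate" means an identical sequence string (case-insensitive); the poster's actual script is not shown in the thread, so the function names `parse_fasta` and `clean_records` are hypothetical.

```python
from typing import Iterable, Iterator, List, Tuple

# Residue codes the poster reports filtering out (assumption: exact set BJOUXZ).
AMBIGUOUS = set("BJOUXZ")


def parse_fasta(text: str) -> Iterator[Tuple[str, str]]:
    """Yield (header, sequence) pairs from FASTA-formatted text."""
    header, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new record starts: emit the previous one, if any.
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def clean_records(records: Iterable[Tuple[str, str]]) -> List[Tuple[str, str]]:
    """Drop sequences with ambiguous residues and repeated sequences.

    The first occurrence of each distinct sequence is kept.
    """
    seen = set()
    kept = []
    for header, seq in records:
        seq = seq.upper()
        if AMBIGUOUS & set(seq):
            continue  # contains at least one of B, J, O, U, X, Z
        if seq in seen:
            continue  # exact duplicate of an earlier sequence
        seen.add(seq)
        kept.append((header, seq))
    return kept


example = """>a
MKT
>b
MKX
>c
mkt
>d
MAA
"""
cleaned = clean_records(parse_fasta(example))
print(cleaned)
```

In the toy input, record `b` is dropped for containing `X` and record `c` is dropped as a duplicate of `a`.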

TimoLassmann commented 2 years ago

Could you share the sequences with me so I can reproduce the problem? Thanks

Shubhangi1397 commented 2 years ago

Hey Timo, Thanks for your email. I am currently working on SARS-CoV-2 ORF1ab protein sequences. I have removed the sequences containing ambiguous characters (BJOUXZ) and duplicate sequences. Here is the dataset: https://drive.google.com/file/d/1fBgjKxpmDXm7gfOYF8JhYIEr8GcF-VpE/view?usp=sharing

I have been getting segmentation faults for datasets of more than 13k sequences. Your input would be valuable.

Many thanks,
Shubhangi Kandwal
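The reports above give two rough thresholds (15,000 and 13k sequences). One way to narrow down the smallest input size that crashes is a bisection over the number of sequences. The sketch below is only an illustration under stated assumptions: it assumes that once a size crashes, every larger size crashes too, and `fake_run` with its 13,000 cutoff is a made-up stand-in for actually running the aligner on the first `n` sequences.

```python
from typing import Callable


def smallest_failing_size(fails: Callable[[int], bool], lo: int, hi: int) -> int:
    """Return the smallest n in (lo, hi] for which fails(n) is True.

    Assumes fails(lo) is False, fails(hi) is True, and that failure is
    monotonic in n (an assumption, not something the thread confirms).
    """
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if fails(mid):
            hi = mid  # crash reproduced: the threshold is at or below mid
        else:
            lo = mid  # no crash: the threshold is above mid
    return hi


# Hypothetical stand-in for "write the first n sequences to a file,
# run the aligner on it, and report whether it crashed".
def fake_run(n: int) -> bool:
    return n > 13_000


threshold = smallest_failing_size(fake_run, lo=1_000, hi=15_000)
print(threshold)
```

In a real run, `fails` could launch the aligner with `subprocess.run` and check for a negative `returncode`, which on POSIX systems indicates the child process was terminated by a signal.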

TimoLassmann commented 2 years ago

Thanks! I think I identified the problem; working on a solution. This may take a few days.

TimoLassmann commented 2 years ago

Dear Shubhangi Kandwal, I fixed the problem in the latest release (3.3.2). Let me know if this works on your end. Thanks for bringing this to my attention! T

Shubhangi1397 commented 2 years ago

Thank you so much Timo. I will let you know how it goes. Regards Shubhangi

--

Shubhangi Kandwal

PhD Student in Biochemistry,

School of Biochemistry and Immunology,

Trinity Biomedical Sciences Institute (TBSI),

Trinity College Dublin, University of Dublin

Shubhangi1397 commented 2 years ago

Hey Timo, Thanks for the fix. Kalign is working fine now :) Regards Shubhangi

TimoLassmann commented 2 years ago

Great. If you run into any other issues, let me know!