dcouvin / CRISPRCasFinder

A Perl script that identifies CRISPR arrays and associated Cas proteins in DNA sequences
https://crisprcas.i2bc.paris-saclay.fr
GNU General Public License v3.0

Error in `muscle': double free or corruption (out): 0x00007ffceb581bf0 #45

Open Wendy361 opened 1 year ago

Wendy361 commented 1 year ago

Hi, I always get error messages when I run CRISPRCasFinder. Is there any way I can fix this? Despite the error messages, the program does not stop running and I still get results. However, I do not know whether those results are reliable.

*** Error in `muscle': double free or corruption (out): 0x00007ffceb581bf0 ***
======= Backtrace: =========
/lib64/libc.so.6(+0x81329)[0x2af91d2af329]
muscle(+0x4d2b3)[0x55cad3ec72b3]
muscle(+0x149bd)[0x55cad3e8e9bd]
muscle(+0x15930)[0x55cad3e8f930]
muscle(+0xe805)[0x55cad3e88805]
/lib64/libc.so.6(__libc_start_main+0xf5)[0x2af91d250555]
muscle(+0x12aad)[0x55cad3e8caad]

dcouvin commented 1 year ago

Hi @Wendy361, thank you for your message. I have never seen this error before. Could you provide more details about the input file and the command you executed, please? Thanks, David

dcouvin commented 1 year ago

Thank you @Wendy361 for your messages and files. It seems to be an error occurring with muscle version 5 (https://github.com/rcedgar/muscle/issues/34). Please try to cut your multi-fasta input file into separate files. Hope this will help! Best, @dcouvin
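For anyone who wants to try the splitting suggestion, here is a minimal Python sketch that writes each record of a multi-FASTA file to its own file. It is not part of CRISPRCasFinder; the function name `split_multifasta` and the output naming scheme are assumptions made for illustration only.

```python
from pathlib import Path
import re


def split_multifasta(fasta_path, out_dir):
    """Write each record of a multi-FASTA file to its own file; return the paths."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    written = []
    handle = None
    with open(fasta_path) as src:
        for line in src:
            if line.startswith(">"):
                # A new header starts a new record: close the previous file.
                if handle is not None:
                    handle.close()
                # Build a filesystem-safe name from the first word of the header.
                words = line[1:].split()
                record_id = words[0] if words else "record"
                safe_id = re.sub(r"[^A-Za-z0-9._-]", "_", record_id)
                path = out_dir / f"{len(written) + 1:04d}_{safe_id}.fasta"
                handle = open(path, "w")
                written.append(path)
            # Lines that appear before the first header are skipped.
            if handle is not None:
                handle.write(line)
    if handle is not None:
        handle.close()
    return written
```

Each returned path can then be passed to CRISPRCasFinder as a separate input file.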

Wendy361 commented 1 year ago

David, I'm still encountering the same error even after splitting the multi-fasta input file into separate files. I also tried using the sequence-fasta file that you provided on GitHub, but the error message continues to appear. However, I was still able to obtain the results of the test sequence-fasta file, which were the same as you provided on GitHub. Do you have any suggestions on how to troubleshoot this further?

Wendy361 commented 1 year ago

Would you mind checking the CRISPRCasFinder_message.txt file that I shared with you earlier? It contains the message "Error in `muscle': double free or corruption (out): ***". After the line "====Memory map:=== 2baa5abf1000#", the following muscle output appears:

muscle 5.1.linux64 [] 1057Gb RAM, 40 cores
Built Feb 24 2022 03:16:15
(C) Copyright 2004-2021 Robert C. Edgar.
https://drive5.com

Input: 3 seqs, avg length 24, max 24

Does this mean muscle works?

dcouvin commented 1 year ago

@Wendy361, I think there is a memory usage problem when running muscle. Could you try running the tool on another machine? Hope this will help. Best, David

Wendy361 commented 1 year ago

I did not run CRISPRCasFinder on my personal computer. I ran the command on a supercomputer (at our university's high-performance computing center) with 1 TB of RAM and 40 cores.

I got a response on the muscle issue tracker: https://github.com/rcedgar/muscle/issues/57. Would you mind taking a look?

dcouvin commented 1 year ago

Thank you! I will also try to add another multiple sequence alignment tool in a future version.

Wendy361 commented 1 year ago

Hi, thank you for your answer. The error went away when I used a FASTA file containing shorter sequences from Illumina. The previous file contained long Nanopore sequences.

qjoypark commented 1 year ago

So, what's the solution now? Are the results reliable when the program shows this message?

qjoypark commented 1 year ago

I tried modifying the Perl code that calls Muscle, changing the call from 'muscle -align' to 'muscle -in'. I also installed Muscle version 3.8.x, and everything now appears to run without any warnings. However, when I compare the output to the online CRISPRCasFinder, the CRISPR array output is the same, but the Cas protein annotation is missing some information. The results appear to be incomplete.
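To illustrate the difference between the two call styles, here is a small Python sketch that picks the argument form from the muscle version banner. It is not the CRISPRCasFinder code (which is Perl); the function names are hypothetical, and the banner formats are assumed from the "muscle 5.1.linux64" line quoted earlier in this thread and from the "MUSCLE v3.8.31" banner that version 3.8 is expected to print.

```python
import re


def muscle_major_version(banner):
    """Extract the major version number from a muscle version banner."""
    match = re.search(r"muscle\s+v?(\d+)\.", banner, flags=re.IGNORECASE)
    if match is None:
        raise ValueError(f"unrecognized muscle banner: {banner!r}")
    return int(match.group(1))


def muscle_command(banner, in_fasta, out_fasta, exe="muscle"):
    """Build the alignment command line that matches the installed version."""
    if muscle_major_version(banner) >= 5:
        # Muscle 5 style: -align <input> -output <output>
        return [exe, "-align", in_fasta, "-output", out_fasta]
    # Muscle 3.8 style: -in <input> -out <output>
    return [exe, "-in", in_fasta, "-out", out_fasta]
```

The returned list could be passed to `subprocess.run`. Mixing the two styles (for example, calling version 3.8 with `-align`) will not work, so the call has to match the installed version.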

Wendy361 commented 1 year ago

I have not yet found a better solution, but I have observed that using a FASTA file with shorter contig sequences can eliminate the error. However, it still occurs occasionally.
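To check whether the failing inputs really are the ones with long contigs, a quick length summary per record may help. This is only a sketch with a hypothetical helper name, `fasta_lengths`; it assumes record IDs (the first word of each header) are unique within the file.

```python
def fasta_lengths(fasta_path):
    """Return {record_id: sequence_length} for every record in a FASTA file."""
    lengths = {}
    current = None
    with open(fasta_path) as src:
        for line in src:
            line = line.strip()
            if line.startswith(">"):
                # Use the first word of the header as the record ID.
                words = line[1:].split()
                current = words[0] if words else f"record_{len(lengths) + 1}"
                lengths[current] = 0
            elif current is not None:
                # Sequence lines may be wrapped, so accumulate their lengths.
                lengths[current] += len(line)
    return lengths
```

Sorting the result by length, for example with `sorted(lengths.items(), key=lambda kv: kv[1], reverse=True)`, shows the longest contigs first.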

Wendy361 commented 1 year ago

Another thing I forgot to mention is that I can barely find any Cas proteins in my data, and I am not sure whether this is related to the error message.