gtonkinhill / panaroo

An updated pipeline for pangenome investigation
MIT License

ERROR when writing output: double free or corruption (!prev): 0x000055f18142deb0 #251

Closed by davidtong28 1 year ago

davidtong28 commented 1 year ago

Version

Installed using Conda as described

conda create -y -n panaroo python=3.9
conda activate panaroo
conda install -y panaroo=1.3.4

Command

panaroo -i ~/wgs_campy/total/bakta1.7/R1S5-[1-9][ABC]/*.gff3 --clean-mode strict --merge_paralogs -o test0

Error

Occurred when writing output. The output files listed in the manual were written, except for core_gene_alignment.aln and core_gene_alignment_filtered.aln.

Output:

pre-processing gff3 files...
100%|██████████████████████████████████████████████████████████████████████████████████████████████████| 5/5 [00:02<00:00,  1.92it/s]
running cmd: cd-hit -T 1 -i test0/combined_protein_CDS.fasta -o test0/combined_protein_cdhit_out.txt -c 0.98 -s 0.98 -aL 0.0 -AL 99999999 -aS 0.0 -AS 99999999 -M 0 -d 999 -g 1 -n 2
================================================================
Program: CD-HIT, V4.8.1 (+OpenMP), May 15 2023, 22:49:31
Command: cd-hit -T 1 -i test0/combined_protein_CDS.fasta -o
         test0/combined_protein_cdhit_out.txt -c 0.98 -s 0.98
         -aL 0.0 -AL 99999999 -aS 0.0 -AS 99999999 -M 0 -d 999
         -g 1 -n 2

Started: Thu Oct 12 19:35:34 2023
================================================================
                            Output
----------------------------------------------------------------
Your word length is 2, using 5 may be faster!
total seq: 8474
longest and shortest : 1932 and 25
Total letters: 2580143
Sequences have been sorted

Approximated minimal memory consumption:
Sequence        : 3M
Buffer          : 1 X 16M = 16M
Table           : 1 X 0M = 0M
Miscellaneous   : 0M
Total           : 20M

Table limit with the given memory limit:
Max number of representatives: 4000000
Max number of word counting entries: 240466000

comparing sequences from          0  to       8474
........
     8474  finished       1745  clusters

Approximated maximum memory consumption: 22M
writing new database
writing clustering information
program completed !

Total CPU time 1.62
generating initial network...
Processing paralogs...
100%|█████████████████████████████████████████████████████████████████████████████████████████████████| 1/1 [00:00<00:00, 852.15it/s]
collapse mistranslations...
Processing depth:  1
Iteration:  1
100%|█████████████████████████████████████████████████████████████████████████████████████████| 1746/1746 [00:00<00:00, 66675.66it/s]
Iteration:  2
100%|██████████████████████████████████████████████████████████████████████████████████████████████| 17/17 [00:00<00:00, 3871.17it/s]
Iteration:  3
100%|███████████████████████████████████████████████████████████████████████████████████████████████| 3/3 [00:00<00:00, 65196.44it/s]
Processing depth:  2
Iteration:  1
100%|████████████████████████████████████████████████████████████████████████████████████████| 1719/1719 [00:00<00:00, 421169.96it/s]
Processing depth:  3
Iteration:  1
100%|████████████████████████████████████████████████████████████████████████████████████████| 1719/1719 [00:00<00:00, 403176.68it/s]
collapse gene families...
Processing depth:  1
Iteration:  1
100%|████████████████████████████████████████████████████████████████████████████████████████| 1719/1719 [00:00<00:00, 295383.20it/s]
Iteration:  2
100%|███████████████████████████████████████████████████████████████████████████████████████████████| 1/1 [00:00<00:00, 27594.11it/s]
Processing depth:  2
Iteration:  1
100%|████████████████████████████████████████████████████████████████████████████████████████| 1717/1717 [00:00<00:00, 650259.14it/s]
Processing depth:  3
Iteration:  1
100%|████████████████████████████████████████████████████████████████████████████████████████| 1717/1717 [00:00<00:00, 618058.70it/s]
trimming contig ends...
refinding genes...
Number of searches to perform:  18
Searching...
5it [00:06,  1.26s/it]
translating hits...
removing by consensus...
Updating output...
Number of refound genes:  4
collapse gene families with refound genes...
Processing depth:  1
Iteration:  1
100%|████████████████████████████████████████████████████████████████████████████████████████| 1711/1711 [00:00<00:00, 519205.19it/s]
Processing depth:  2
Iteration:  1
100%|████████████████████████████████████████████████████████████████████████████████████████| 1711/1711 [00:00<00:00, 634186.47it/s]
Processing depth:  3
Iteration:  1
100%|████████████████████████████████████████████████████████████████████████████████████████| 1711/1711 [00:00<00:00, 592997.37it/s]
writing output...
*** Error in `/home/davidtong28/.conda/envs/panaroo_1.3.4/bin/python3.9': double free or corruption (!prev): 0x000055f18142deb0 ***
======= Backtrace: =========
/lib64/libc.so.6(+0x7d1fd)[0x7f2fbafe41fd]
/home/davidtong28/.conda/envs/panaroo_1.3.4/bin/python3.9(PyObject_GC_Del+0x1c7)[0x55f180b93c57]
/home/davidtong28/.conda/envs/panaroo_1.3.4/bin/python3.9(+0x1449ec)[0x55f180bb39ec]
/home/davidtong28/.conda/envs/panaroo_1.3.4/bin/python3.9(+0x13e533)[0x55f180bad533]
/home/davidtong28/.conda/envs/panaroo_1.3.4/bin/python3.9(+0x121a07)[0x55f180b90a07]
/home/davidtong28/.conda/envs/panaroo_1.3.4/bin/python3.9(_PyModule_ClearDict+0xfc)[0x55f180c1300c]
/home/davidtong28/.conda/envs/panaroo_1.3.4/bin/python3.9(+0x209715)[0x55f180c78715]
/home/davidtong28/.conda/envs/panaroo_1.3.4/bin/python3.9(Py_FinalizeEx+0x182)[0x55f180c77602]
/home/davidtong28/.conda/envs/panaroo_1.3.4/bin/python3.9(Py_Exit+0x8)[0x55f180c78318]
/home/davidtong28/.conda/envs/panaroo_1.3.4/bin/python3.9(+0x2067eb)[0x55f180c757eb]
/home/davidtong28/.conda/envs/panaroo_1.3.4/bin/python3.9(PyErr_PrintEx+0x11)[0x55f180c754e1]
/home/davidtong28/.conda/envs/panaroo_1.3.4/bin/python3.9(+0x981eb)[0x55f180b071eb]
/home/davidtong28/.conda/envs/panaroo_1.3.4/bin/python3.9(Py_RunMain+0x38d)[0x55f180c6b3bd]
/home/davidtong28/.conda/envs/panaroo_1.3.4/bin/python3.9(Py_BytesMain+0x37)[0x55f180c3ef07]
/lib64/libc.so.6(__libc_start_main+0xf5)[0x7f2fbaf88af5]
/home/davidtong28/.conda/envs/panaroo_1.3.4/bin/python3.9(+0x1cfe01)[0x55f180c3ee01]
======= Memory map: ========
55f180a6f000-55f180acb000 r--p 00000000 00:2a 65289962                   /home/davidtong28/.conda/envs/panaroo_1.3.4/bin/python3.9
...(omitting hundreds of lines)...
7f2fb1cf5000-7f2fb1cf8000 r--p 00016000 00:2a 73981416                   /home/davidtong28/.conda/envs/panaroo_1.3.4/lib/libgcc_s.so.1
Aborted (core dumped)

The same error also appears when the input files are invalid.
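For orientation, the named frames in the backtrace above can be pulled out with a few lines of Python. This is only a sketch: `named_frames` is a hypothetical helper, not part of panaroo, and the regular expression assumes the glibc `path(symbol+0xoffset)[address]` layout seen in the log. For this trace, the symbols it returns suggest the abort was raised under `PyObject_GC_Del` while `Py_FinalizeEx` was clearing module dictionaries, that is, during interpreter shutdown.

```python
import re

# Matches the "(symbol+0xoffset)" part of a glibc backtrace line.
# Frames without a symbol name, such as "(+0x7d1fd)", do not match.
FRAME_RE = re.compile(r"\(([A-Za-z_]\w*)\+0x[0-9a-fA-F]+\)")


def named_frames(lines):
    """Return the symbol names found in backtrace lines, innermost frame first."""
    names = []
    for line in lines:
        match = FRAME_RE.search(line)
        if match:
            names.append(match.group(1))
    return names


if __name__ == "__main__":
    # A few lines copied from the backtrace in this report.
    sample = [
        "/lib64/libc.so.6(+0x7d1fd)[0x7f2fbafe41fd]",
        "/home/davidtong28/.conda/envs/panaroo_1.3.4/bin/python3.9(PyObject_GC_Del+0x1c7)[0x55f180b93c57]",
        "/home/davidtong28/.conda/envs/panaroo_1.3.4/bin/python3.9(_PyModule_ClearDict+0xfc)[0x55f180c1300c]",
        "/home/davidtong28/.conda/envs/panaroo_1.3.4/bin/python3.9(Py_FinalizeEx+0x182)[0x55f180c77602]",
    ]
    for name in named_frames(sample):
        print(name)
```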

gtonkinhill commented 1 year ago

Hi,

This looks a bit odd; it might be a memory issue. Have you been able to run it successfully on another set of files, or have you tried running it on a small subset of these files? If not, it might be an issue with the installation, which will be harder to debug remotely.
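One way to try the small-subset test is sketched below. It reuses only the flags from the command in the report (`-i`, `--clean-mode strict`, `--merge_paralogs`, `-o`). The helper name `build_subset_command`, the subset size of 2, and the output directory `test_subset` are assumptions made for illustration.

```python
import glob
import os
import shlex


def build_subset_command(gff_paths, n_files, out_dir):
    """Build a panaroo command line for the first n_files inputs (sorted by path)."""
    subset = sorted(gff_paths)[:n_files]
    if not subset:
        raise ValueError("no input files selected")
    return [
        "panaroo",
        "-i", *subset,
        "--clean-mode", "strict",
        "--merge_paralogs",
        "-o", out_dir,
    ]


if __name__ == "__main__":
    # Glob pattern taken from the original command; adjust to the local layout.
    pattern = os.path.expanduser(
        "~/wgs_campy/total/bakta1.7/R1S5-[1-9][ABC]/*.gff3"
    )
    paths = glob.glob(pattern)
    if paths:
        # Print the command instead of running it, so it can be inspected first.
        print(shlex.join(build_subset_command(paths, 2, "test_subset")))
    else:
        print("no GFF3 files matched", pattern)
```

If the crash still occurs with only a couple of inputs, the data set is less likely to be the cause.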

davidtong28 commented 1 year ago

No, the error appears consistently with any input data, even invalid data, so I would also consider it an installation issue. I'll try running it on another cluster. Luckily, I'm not using the alignment files for now, and all the other output files look usable.
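To confirm which outputs survived the crash, a short check along these lines may help. It is a sketch under assumptions: `split_outputs` is a hypothetical helper, the list of expected names must be supplied by the user, and treating an empty file as missing is a choice made here rather than anything panaroo defines. The two alignment file names and the `test0` directory come from the report above.

```python
from pathlib import Path

# File names mentioned in the report as not having been written.
ALIGNMENT_FILES = (
    "core_gene_alignment.aln",
    "core_gene_alignment_filtered.aln",
)


def split_outputs(out_dir, expected):
    """Return (present, missing) lists of expected file names in out_dir.

    A file counts as present only if it exists and is not empty.
    """
    out = Path(out_dir)
    present, missing = [], []
    for name in expected:
        path = out / name
        if path.is_file() and path.stat().st_size > 0:
            present.append(name)
        else:
            missing.append(name)
    return present, missing


if __name__ == "__main__":
    # "gene_presence_absence.csv" is included as an example of another output;
    # extend the list with whatever files the manual leads you to expect.
    expected = ["gene_presence_absence.csv", *ALIGNMENT_FILES]
    present, missing = split_outputs("test0", expected)
    print("present:", present)
    print("missing:", missing)
```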