phyloacc / PhyloAcc

PhyloAcc: software to detect changes in the conservation of a genomic region
GNU General Public License v3.0

PhyloAcc-ST Segmentation fault (core dumped) #45

Closed: Simin-CHAI closed this issue 1 year ago

Simin-CHAI commented 1 year ago

Hi, authors

When I ran snakemake on a cluster, the error "one of the commands exited with non-zero exit code" occurred. Before this, phyloacc.py worked well and reported "PhyloAcc job files successfully generated".

To find the cause, I ran one PhyloAcc-ST command directly, copied from the output of snakemake --dryrun. It failed with "Segmentation fault (core dumped)".

Running gdb PhyloAcc-ST core.XXX gave the following backtrace:

#0  0x00007f2552b2e8c1 in __strlen_sse2_pminub () from /Share/home/chaisimin/miniconda3/envs/phyloacc-env/bin/../lib/libc.so.6
#1  0x00005564af429526 in length (__s=0x0)
#2  assign (__s=0x0, this=0x5564af500a90) at src/PhyloAcc-ST//newick.cpp:1451
#3  operator= (__s=0x0, this=0x5564af500a90) at src/PhyloAcc-ST//newick.cpp:689
#4  TravelTree3 (root=0x5564af4f1a30, phylo_tree=...) at src/PhyloAcc-ST//newick.cpp:415
#5  0x00005564af4294f0 in TravelTree3 (root=0x5564af4f19b0, phylo_tree=...)
#6  0x00005564af4294f0 in TravelTree3 (root=0x5564af4f1330, phylo_tree=...)
#7  0x00005564af4294f0 in TravelTree3 (root=0x5564af4ef6e0, phylo_tree=...)
#8  0x00005564af4294f0 in TravelTree3 (root=0x5564af4eb3d0, phylo_tree=...)
#9  0x00005564af4294f0 in TravelTree3 (root=0x5564af4ed6a0, phylo_tree=...)
#10 0x00005564af429f62 in LoadPhyloTree(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >) ()
#11 0x00005564af3f877c in main ()
#12 0x00007f25529e1555 in __libc_start_main () from /Share/home/chaisimin/miniconda3/envs/phyloacc-env/bin/../lib/libc.so.6
#13 0x00005564af3feae5 in _start ()

I have no idea how to solve this now. Could you please help me find the possible causes and suggest some solutions? Thanks a lot and have a good day.

gwct commented 1 year ago

Hello, Hmm, that error is difficult to parse, but from what you sent it seems to have problems reading the tree. Would it be possible to send the relevant input files so I can try and re-create this error? Particularly the config and tree file would be helpful just to see, but if you can send the sequence file as well that would really help me figure it out. Also, if you can send the commands you ran for both phyloacc.py and PhyloAcc-ST that would help too. Thanks. -Gregg

Simin-CHAI commented 1 year ago

Thank you for your quick reply and support. Because of the restriction of attachment type, I send an email including files and my commands which you mentioned :)

Best wishes, Simin

gwct commented 1 year ago

So it does seem to be crashing when it is reading the tree. I don't see anything obviously wrong with the tree's format, but the branch lengths are extremely large for this type of analysis. How did you infer these branch lengths? They should be scaled to the relative rate of neutral substitutions, which are usually quite small (<1.0).
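The branch-length concern above can be checked quickly before running anything. The following is a minimal sketch, not part of PhyloAcc: it assumes the tree is available as a plain Newick string (for example, copied from the tree line of the .mod file), that taxon names contain no colons, and it uses the 1.0 figure mentioned above only as a rough heuristic. The function names are hypothetical.

```python
import re

# Matches the number that follows a ':' in a Newick string, e.g. ":0.05" or ":1e-3".
_BRANCH_LENGTH = re.compile(r":\s*([0-9]*\.?[0-9]+(?:[eE][-+]?[0-9]+)?)")


def branch_lengths(newick):
    """Return all branch lengths found in a Newick string, in order."""
    return [float(value) for value in _BRANCH_LENGTH.findall(newick)]


def suspicious_branch_lengths(newick, threshold=1.0):
    """Return the branch lengths larger than the (heuristic) threshold."""
    return [length for length in branch_lengths(newick) if length > threshold]


if __name__ == "__main__":
    tree = "((A:0.1,B:12.5):0.05,C:1e-3);"  # toy tree, not the data from this issue
    print(branch_lengths(tree))
    print(suspicious_branch_lengths(tree))
```

A non-empty result from `suspicious_branch_lengths` does not prove the tree is wrong, but it is a reasonable prompt to re-check how the branch lengths were estimated.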

gwct commented 1 year ago

I see your notes in your script now and it seems like you inferred the branch lengths with phyloFit as we recommend. However, something seems to be going wrong with that to give such large branch lengths. The sequence you input for phyloFit is _allSeq_noGap_oneLine.fas. How did you get these alignments and why and how were the gaps removed? Is it possible when the gaps were removed it messed up the alignment columns?
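One cheap way to test whether gap removal broke the alignment columns is to confirm that every sequence still has the same length. The sketch below is only an illustration under stated assumptions: the input is FASTA text, and the helper names are hypothetical. Equal lengths do not guarantee that the columns are still homologous, but unequal lengths show that gaps were removed per sequence rather than per column.

```python
def sequence_lengths(fasta_text):
    """Map each FASTA record name to its sequence length (spaces ignored)."""
    lengths = {}
    name = None
    for line in fasta_text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            name = line[1:].strip()
            lengths[name] = 0
        elif name is not None:
            lengths[name] += len(line.replace(" ", ""))
    return lengths


def is_aligned(fasta_text):
    """True if all sequences in the FASTA text have the same length."""
    return len(set(sequence_lengths(fasta_text).values())) <= 1


if __name__ == "__main__":
    broken = ">sp1\nACGT\n>sp2\nACGGT\n"  # toy example with unequal lengths
    print(sequence_lengths(broken))
    print(is_aligned(broken))
```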

Simin-CHAI commented 1 year ago

There was a problem in the .mod file built from my _allSeq_noGap_oneLine.fas: the gaps had been removed incorrectly. However, a new .mod file with normal branch lengths and a correctly prepared .fas file produce the same error as before... Could you please look into other possible reasons for this "PhyloAcc-ST Segmentation fault (core dumped)"?

(gdb) where

#0  0x00007f1ead5c38c1 in __strlen_sse2_pminub () from /Share/home/chaisimin/miniconda3/envs/phyloacc-env/bin/../lib/libc.so.6
#1  0x000055f5f1665526 in length (__s=0x0)
#2  assign (__s=0x0, this=0x55f5f1d8f1e0) at src/PhyloAcc-ST//newick.cpp:1451
#3  operator= (__s=0x0, this=0x55f5f1d8f1e0) at src/PhyloAcc-ST//newick.cpp:689
#4  TravelTree3 (root=0x55f5f1d6d3e0, phylo_tree=...) at src/PhyloAcc-ST//newick.cpp:415
#5  0x000055f5f16654f0 in TravelTree3 (root=0x55f5f1d75e60, phylo_tree=...)
#6  0x000055f5f16654f0 in TravelTree3 (root=0x55f5f1d75dc0, phylo_tree=...)
#7  0x000055f5f16654f0 in TravelTree3 (root=0x55f5f1d75d20, phylo_tree=...)
#8  0x000055f5f16654f0 in TravelTree3 (root=0x55f5f1d75380, phylo_tree=...)
#9  0x000055f5f16654f0 in TravelTree3 (root=0x55f5f1d74670, phylo_tree=...)
#10 0x000055f5f1665f62 in LoadPhyloTree(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >) ()
#11 0x000055f5f163477c in main ()
#12 0x00007f1ead476555 in __libc_start_main () from /Share/home/chaisimin/miniconda3/envs/phyloacc-env/bin/../lib/libc.so.6
#13 0x000055f5f163aae5 in _start ()

Thank you so much, Simin

gwct commented 1 year ago

The tree looks better, but it still seems to be having trouble reading it. Have you also re-run the phyloacc.py interface with the new tree and alignments? Another thing I noticed is that your alignments have spaces in them. I know that gblocks puts these in, but I'm wondering if that might be messing something up. Did you use a bed file with coordinates that included the spaces? Or did you input separate alignments for each locus? Either way, I think I would try removing the spaces from the alignments and re-running phyloacc.py with those alignments and the new tree to make sure none of those things are causing problems. If you do that and still get an error, send me the new files so I can take a look. -Gregg
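Removing the spaces from the alignments could be done along these lines. This is a minimal sketch assuming FASTA input; the function name is hypothetical. It leaves header lines alone and strips only whitespace inside sequence lines, so gap characters such as "-" are preserved.

```python
def strip_alignment_spaces(fasta_text):
    """Remove whitespace from sequence lines of FASTA text; keep headers as-is."""
    cleaned = []
    for line in fasta_text.splitlines():
        if line.startswith(">"):
            cleaned.append(line.rstrip())
        else:
            sequence = "".join(line.split())  # drops spaces and tabs within the line
            if sequence:  # skip lines that were blank or whitespace only
                cleaned.append(sequence)
    return "\n".join(cleaned) + "\n"


if __name__ == "__main__":
    example = ">sp1\nACGT ACGT\n>sp2\nAC-T ACGA\n"  # toy gblocks-style spacing
    print(strip_alignment_spaces(example), end="")
```

If a bed file is used, its coordinates would need to refer to positions in the cleaned alignment, since removing spaces shifts column indices.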

Simin-CHAI commented 1 year ago

Hello, I tried your suggestion: I removed all the spaces and intervals from the alignments and reran the phyloacc scripts, but the same errors were reported, both in the screen output and in the core file generated by PhyloAcc-ST... I sent my input files to you by email. Please have a look. Thank you for your time and patience!

Regards, Simin

gwct commented 1 year ago

Hmm, ok thanks. I'm still getting the error as well, and I'm still not exactly sure why. My best guess is still that it is something to do with the tree, but I'm not sure any more. I don't see anything obvious about it that would lead to an error. I think I'll have to call in @xyz111131 or @HanY-H to take a look at this. Do you mind if I share a minimal example from your data with them (1 or 2 loci and the mod file)?

Simin-CHAI commented 1 year ago

Sure thing. Please share several examples with other developers. THANK YOU!!

gwct commented 1 year ago

Just an update to say that I've been in touch with them and they say they'll be able to take a look sometime this weekend.

gwct commented 1 year ago

Hi again, It looks like the problem is that the internal nodes of the tree are not labeled. When I label them I no longer get this error. Thanks to @HanY-H for pointing this out. There's a method for labeling the tree here: https://github.com/xyz111131/PhyloAcc_v1#trouble-shooting

I'll have to add something to the current documentation that refers to this, and we should have a more descriptive error message for this. Sorry this took so long to figure out, and let me know if you have any other questions!
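For readers who hit the same crash, here is a minimal sketch of what labeling the internal nodes means. It is not the method described at the link above, only an illustration in plain Python. It assumes a Newick string without quoted labels or bracketed comments, and the function name and the "node" prefix are hypothetical. The backtraces earlier in this thread show a null pointer (`__s=0x0`) being assigned to a string inside TravelTree3, which is consistent with a node name being missing.

```python
def label_internal_nodes(newick, prefix="node"):
    """Add a unique label after every ')' that is not already followed by one."""
    out = []
    counter = 0
    i = 0
    n = len(newick)
    while i < n:
        ch = newick[i]
        out.append(ch)
        i += 1
        if ch == ")":
            # Look past whitespace to see what follows the closing parenthesis.
            j = i
            while j < n and newick[j].isspace():
                j += 1
            # No label if the next token is a branch length, a sibling separator,
            # the parent's closing parenthesis, the end marker, or end of input.
            if j >= n or newick[j] in ":,);":
                counter += 1
                out.append(prefix + str(counter))
    return "".join(out)


if __name__ == "__main__":
    tree = "((A:0.1,B:0.2):0.05,C:0.3);"  # toy tree, not the data from this issue
    print(label_internal_nodes(tree))
```

The sketch also labels the root and does not check for clashes with existing tip names, so choose a prefix that no species name uses. Numeric support values after a ")" are treated as existing labels and left untouched.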

Simin-CHAI commented 1 year ago

Hi, it was indeed the missing node labels that caused this core dumped error. Now PhyloAcc-ST works! Thanks to you and your friend for your time and patience...