bioinfologics / satsuma2

FFT cross-correlation based synteny aligner, (re)designed to make full use of parallel computing

Satsuma2 dies with Segmentation fault (core dumped) #21

Open claudiamartin94 opened 4 years ago

claudiamartin94 commented 4 years ago

Hi, I'm having trouble getting Satsuma to run even on the test dog and human sequences. I get the following segmentation error "/gpfs/home/yuw17aeu/.lsbatch/1572978474.862816.shell: line 15: 8112 Segmentation fault (core dumped) SatsumaSynteny2 -q /gpfs/home/yuw17aeu/satsuma/human.X.part.fasta -t /gpfs/home/yuw17aeu/satsuma/dog.X.part.fasta -o /gpfs/home/yuw17aeu/satsuma/xcorrtest -slaves 10 -threads 4"

My submission script is the following:

#!/bin/sh
#BSUB -n 2
#BSUB -q long-eth
#BSUB -J satsuma-test
#BSUB -R "rusage[mem=10000]" -M 10000
#BSUB -oo satsuma-%J.out
#BSUB -ee satsuma-%J.err

. /etc/profile
module add satsuma2/3.1
module add singularity/3.1.1
export SATSUMA2_PATH=/gpfs/software/satsuma2/3.1/bin

SatsumaSynteny2 -q /gpfs/home/yuw17aeu/satsuma/human.X.part.fasta -t /gpfs/home/yuw17aeu/satsuma/dog.X.part.fasta -o /gpfs/home/yuw17aeu/satsuma/xcorrtest -slaves 10 -threads 4

Any help much appreciated, thanks, Claudia

jonwright99 commented 4 years ago

Hi Claudia, Does it fail straight away with this error? Can you run the test datasets with the default parameters, as in the test script test_SatsumaSynteny2? That is, remove the -slaves parameter (which will default to 1) and use -threads 2. Thanks, Jon

claudiamartin94 commented 4 years ago

Hi Jon, Thanks for commenting so quickly! The job doesn't fail immediately when I run the script above or the test script, and they both fail with the same error. It manages to sort the Kmer array (and I get output files such as kmatch_results.k15). It loads both of the fasta sequences. It displays an "error on binding" but then seems to continue. It then collects forward and reverse matches and creates query chunks. Just before it fails it says:

Loading target sequence: /gpfs/home/yuw17aeu/satsuma/dog.X.part.fasta
Creating target chunks... select=0 chunks=82
chunks: 82
DONE
TIME SPENT ON LOADING: 1
== launching workers ==
== Entering communication loop ==
comm loop for e0003 3495
worker created, now to work!!!
worker created, now to work!!!
Segmentation fault (core dumped)

The output files at this point are kmer files 11, 13, 15, 17, 19, 21 etc. up to 31 and a "satsuma.log" file which just says "Starting".

The way it has been set up on the cluster/HPC here is I need to add the modules each time. Do you think the way it has been installed could be the issue?

When I run the test data I get the same error:

== launching workers ==
== Entering communication loop ==
comm loop for e0078 3491
worker created, now to work!!!
worker created, now to work!!!
/gpfs/home/yuw17aeu/.lsbatch/1573037128.868135.shell: line 6: 26635 Segmentation fault (core dumped) SatsumaSynteny2 -q /gpfs/home/yuw17aeu/satsuma/human.X.part.fasta -t /gpfs/home/yuw17aeu/satsuma/dog.X.part.fasta -o /gpfs/home/yuw17aeu/satsuma/xcorr_sample_synt_jon -threads 2

Thank you for your help! Claudia

bjclavijo commented 4 years ago

Error in binding sounds like a network error. Do you have permissions to open ports and such?
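A quick way to check whether a compute node lets you open a listening socket at all — a minimal sketch, run from an interactive job on the node; port 0 just asks the OS for any free port, since Satsuma2 chooses its own at runtime:

```shell
python3 - <<'EOF'
import socket
# Open and bind a TCP listening socket, roughly what the Satsuma2
# master does before the slaves connect back to it.
s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
s.bind(("", 0))          # port 0 = let the OS pick a free port
s.listen(1)
print("bind OK on port", s.getsockname()[1])
s.close()
EOF
```

If this fails on a compute node, the "error on binding" (and possibly the segfault that follows it) is coming from the cluster's network restrictions rather than from Satsuma2 itself.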

Best,

bj


claudiamartin94 commented 4 years ago

Hi Bernardo, I doubt I do, but I'm not sure this is the same problem as the segmentation fault. Or maybe the binding error is what triggers the segmentation fault.

I may ask the cluster to try and reinstall the program if you think that's the issue. When they initially installed it I was having trouble locating the path to the binaries.

This is what they told me. "Satsuma2 has been installed on HPC as a singularity container image. The reason being it requires Gcc 5.5 and cmake 3.x which is not available on operating system we have (Centos 6). As such the binaries are part of satsuma2.sif (image file) and can not be accessed by user on HPC."
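If the binaries really are locked inside the image, they would normally be invoked through the container runtime. A sketch, with a hypothetical image path (the actual location of satsuma2.sif on the cluster may differ):

```shell
# Hypothetical .sif path; substitute the real location on your cluster.
singularity exec /gpfs/software/satsuma2/3.1/satsuma2.sif \
    SatsumaSynteny2 -q human.X.part.fasta -t dog.X.part.fasta -o xcorrtest -threads 2
```

One caveat: SatsumaSynteny2 launches HomologyByXCorrSlave itself via SATSUMA2_PATH, so the slave binaries would also need to resolve inside the container; wrapping only the top-level call may not be enough.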

Could you advise on how best to instruct the HPC group to install Satsuma?

Best, Claudia

bjclavijo commented 4 years ago

The network code is not necessarily super robust there, so a binding error may well trigger a segfault when it then tries to read from an invalid file descriptor or something like that. If you attach a debugger to it, it would be trivial to check whether that is the case, although I have no idea whether your setup allows that.

Best,

bj


claudiamartin94 commented 4 years ago

I'm afraid I don't understand what you are suggesting. It's unlikely that I am able to use a debugger as I don't have much control over the cluster. The HPC team don't know how to help so I'm a bit stuck.

Do you know of any other software that has similar capabilities? I need to map my bird genome scaffolds to the Zebra finch genome. Currently our genome is in many contigs and it's hard to visualise genome-wide patterns.

Best, Claudia

bjclavijo commented 4 years ago

You can try the old version of satsuma, which is slower but doesn’t use network-based communication but rather disk-based.

Manfred?

Best,

bj


bjclavijo commented 4 years ago

Hi Claudia,

What exactly is the issue (sorry if I missed some previous messages)? Running the old version of Satsuma might not be the best idea, especially on a distributed cluster, where there is a lot of latency in disk-based communication.

Cheers,


claudiamartin94 commented 4 years ago

Hi Manfred,

Thank you for offering to help.

I was having problems getting my own data to run so I have gone back to getting the practice data to work for the dog and human fasta files.

When I run SatsumaSynteny2 on our HPC, the job fails with a segmentation fault (see below for the error message). I have tried many combinations of slaves, threads and memory. Do you know of anything I can do to get around this?

== launching workers ==
worker created, now to work!!!
worker created, now to work!!!
worker created, now to work!!!
worker created, now to work!!!
worker created, now to work!!!
worker created, now to work!!!
worker created, now to work!!!
worker created, now to work!!!

== Entering communication loop ==
comm loop for e0147 3491

<< output from stderr >>
/gpfs/home/yuw17aeu/.lsbatch/1572981254.863387.shell: line 15: 5411 Segmentation fault (core dumped) SatsumaSynteny2 -q /gpfs/home/yuw17aeu/satsuma/human.X.part.fasta -t /gpfs/home/yuw17aeu/satsuma/dog.X.part.fasta -o /gpfs/home/yuw17aeu/satsuma/xcorrtest2 -slaves 10 -threads 8 -sl_mem 500

It's likely a problem with how it has been installed on our cluster, but I'm unsure what to tell them that might help.

Best, Claudia

claudiamartin94 commented 4 years ago

The HPC team believe that it's this line that causes the segmentation fault:

"/gpfs/software/satsuma2/3.1/bin/HomologyByXCorrSlave -master lgn02 -port 3494 -sid 1 -p 1 -q human.X.part.fasta -t dog.X.p art.fasta -l 0 -q_chunk 4096 -t_chunk 4096 -min_prob 0.99999 -cutoff 1.8 ) = 197"

jonwright99 commented 4 years ago

Hi Claudia, We're working on a new installation process for Satsuma2 which may solve your problem. Give us a couple of days and we'll let you know how it goes and how you can test the new installation. Best, Jon

claudiamartin94 commented 4 years ago

Hi Jon, That's fantastic, thank you. Best, Claudia

claudiamartin94 commented 4 years ago

Hi Jon and Bernardo,

The HPC team have updated our cluster now to Centos 7 and so I am trying to run SatsumaSynteny again. I appear to be missing the "satsuma_run.sh" file in the binary bin/ directory. Could this be a problem with the current configuration? I have the following files...

BlockDisplaySatsuma ChromosomePaint HomologyByXCorr MatchDump MergeXCorrMatches ReverseSatsumaOut SatsumaToGFF ChainMatches ColaAlignSatsuma HomologyByXCorrSlave MatchesByFeature MicroSyntenyPlot SatsumaSynteny2 SortSatsuma Chromosemble FilterGridSeeds KMatch MergeScaffoldsBySynteny OrderOrientBySynteny SatsumaToFASTA

Best, Claudia