soedinglab / MMseqs2

MMseqs2: ultra fast and sensitive search and clustering suite
https://mmseqs.com
GNU General Public License v3.0

Segmentation Fault #187

Closed chodarq closed 5 years ago

chodarq commented 5 years ago

Hi,

I'm running MMseqs2 to cluster a huge set of FASTA sequences (nearly 640000). I ran the easy-cluster procedure (mmseqs easy-cluster scaffold.fa clustered tmp), but at some point the following message appears:

Index table: Masked residues: 2748074
Index table: fill
Segmentation fault (core dumped)
Error: Prefilter step 0 died
Error: Search died

Any idea what happened? Thanks

martin-steinegger commented 5 years ago

Thank you for reporting this. Could you please provide more information, such as the MMseqs2 version, the output log, the platform, and the computer specifications?

chodarq commented 5 years ago

Sure, sorry. The MMseqs2 version is '44bde75f0e9f4d0ffc60970bee13347fe89bcb96'. It is running on a Dell computer with 64 cores, a 5 TB HDD partition, and 1 TB of RAM, under Ubuntu 18.04 LTS. I couldn't find any log file, so I'm attaching the output of the run up to the fault (1>log.txt). Hope that helps, thanks!! log.txt

martin-steinegger commented 5 years ago

It seems that you are trying to cluster nucleotide sequences that are longer than 65536? You need to increase the maximum sequence length using --max-seq-len.
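One way to check whether an input file contains sequences above that threshold is to scan the FASTA for its longest record. The following is a minimal sketch, not part of MMseqs2: the function names are hypothetical, the parser assumes a plain-text FASTA with ">" header lines, and the tiny in-memory input stands in for a real file such as scaffold.fa.

```python
import io

LIMIT = 65536  # threshold mentioned in this thread (2^16)


def sequence_lengths(handle):
    """Yield (header, length) for each record in a FASTA stream."""
    header, length = None, 0
    for line in handle:
        line = line.strip()
        if not line:
            continue  # ignore blank lines
        if line.startswith(">"):
            if header is not None:
                yield header, length  # emit the record that just ended
            header, length = line[1:], 0
        elif header is not None:
            length += len(line)  # sequence may span several lines
    if header is not None:
        yield header, length  # emit the final record


def longest_sequence(handle):
    """Return the length of the longest record, or 0 for an empty input."""
    return max((n for _, n in sequence_lengths(handle)), default=0)


# Tiny in-memory example; for a real file, pass open("scaffold.fa") instead.
demo = io.StringIO(">a\nACGT\nAC\n>b\nACG\n")
longest = longest_sequence(demo)
print(longest, longest > LIMIT)
```

If the longest record exceeds the threshold, that length is a candidate value for --max-seq-len, for example appended to the original call: mmseqs easy-cluster scaffold.fa clustered tmp --max-seq-len followed by the number.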

chodarq commented 5 years ago

Oh... could be. The file is the output of a metagenome assembly. Is there a limit on the --max-seq-len option? Thanks!

martin-steinegger commented 5 years ago

The clustering might need slightly more memory with an increased --max-seq-len, but memory should not be a problem on your computer. Sequences longer than 2^16 are not yet well tested, but clustering should work without any further issues.

chodarq commented 5 years ago

I will try it and report back. Thanks!

chodarq commented 5 years ago

Just an update. Three days ago, I started a run with the --max-seq-len parameter set to the length of the longest scaffold. It works in the sense that the segmentation fault no longer appears, but it's still running... so I'm not sure whether it's processing data or stuck in an infinite loop. All 64 processors are running at 100% and 8 GB of memory are in use. Greetings, Christian.

martin-steinegger commented 5 years ago

Ah, good that there is no seg. fault anymore. Could you attach the log output?

chodarq commented 5 years ago

Once it finishes... sure. I forgot to redirect the log output with 1> ... my mistake.
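To avoid losing the log on a long run, the output can be captured to a file from the start. Below is a minimal sketch of doing this from a wrapper script rather than with shell redirection. The helper name run_with_log and the log file name are hypothetical, and a stand-in command is used so that the example runs without MMseqs2 installed.

```python
import subprocess
import sys
import tempfile
from pathlib import Path


def run_with_log(cmd, log_path):
    """Run cmd, writing stdout and stderr to log_path; return the exit code."""
    with open(log_path, "w") as log:
        result = subprocess.run(cmd, stdout=log, stderr=subprocess.STDOUT)
    return result.returncode


# Stand-in command so the sketch runs anywhere; a real call would pass e.g.
# ["mmseqs", "easy-cluster", "scaffold.fa", "clustered", "tmp"] instead.
demo_cmd = [sys.executable, "-c", "print('Index table: fill')"]
log_file = Path(tempfile.gettempdir()) / "mmseqs_demo_log.txt"
exit_code = run_with_log(demo_cmd, log_file)
print(exit_code, log_file.read_text().strip())
```

Merging stderr into the same file keeps error messages next to the progress output that preceded them.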

martin-steinegger commented 5 years ago

Did the run finish? If not, would it be possible to share the data with me?

martin-steinegger commented 5 years ago

I will close this issue for now. Please reopen it if you have an update.