Open arglog opened 4 years ago
That's most likely a different issue. We run into trouble when we estimate RAM usage incorrectly; when that happens, performance usually tanks pretty hard. Martin recently made some improvements to lessen this problem, but apparently it's still a problem.
What are the specs of the system you are running this clustering on?
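For anyone answering the question above, a minimal sketch for collecting the basic numbers on a Linux machine. It assumes /proc/meminfo is available (it falls back to "unknown" otherwise); the variable names are only illustrative.

```shell
#!/bin/sh
# Number of online CPU cores (falls back to "unknown" if getconf fails).
CORES=$(getconf _NPROCESSORS_ONLN 2>/dev/null || echo unknown)

# Total RAM in kB, read from /proc/meminfo when it exists (Linux only).
if [ -r /proc/meminfo ]; then
    MEM_KB=$(awk '/^MemTotal:/ {print $2}' /proc/meminfo)
else
    MEM_KB=unknown
fi

echo "cores: $CORES"
echo "total RAM (kB): $MEM_KB"

# Free space on the file system holding the current directory,
# which matters when the tmp directory lives here.
df -h .
```

Posting this output together with the size of the input database makes it easier to judge whether the data can fit into RAM.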
I think the sequence database is just a bit too large to fit into RAM. You could try to use the --compressed 1 parameter to compress each sequence (and all intermediate databases). You will pay a slight cost in runtime for the constant decompression, but that will be more than offset since the sequences will not be constantly evicted from the OS file cache.
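As a rough sketch, the suggestion above could look like this on the command line. The input file, output prefix, and tmp directory names are placeholders (assumptions, not taken from the original report); only the --compressed 1 flag comes from the reply.

```shell
#!/bin/sh
# Placeholder paths (assumptions); replace with your own files.
INPUT="${INPUT:-input.fasta}"
PREFIX="${PREFIX:-clusterRes}"
TMPDIR_MM="${TMPDIR_MM:-tmp}"

# Same easy-linclust call as before, with compression enabled for the
# sequence database and all intermediate databases.
CMD="mmseqs easy-linclust $INPUT $PREFIX $TMPDIR_MM --compressed 1"
echo "$CMD"

# Only run the command when mmseqs is installed and the input exists.
if command -v mmseqs >/dev/null 2>&1 && [ -f "$INPUT" ]; then
    $CMD
fi
```

Expect somewhat higher CPU time from decompression; the hoped-for gain is that more of the database stays in the OS file cache.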
Dealing with billions of sequences is still kind of awkward and difficult. We have to improve memory management for these cases.
Summary: Running easy-linclust on SRC got stuck after the first call of rescorediagonal. No progress and no printed information for ~12h. Not sure if it's related to #323, but since the behavior is different I am opening a new issue.
Expected Behavior
The run finishes and exits normally.
Current Behavior
Got stuck after the first call of rescorediagonal. No progress and no printed information for ~12h.
Steps to Reproduce (for bugs)
MMseqs Output (for bugs)
^^^^^^ There is no more printed info after the last line in the above output, and it got stuck for more than 12h.
Context
Your Environment
Include as many relevant details about the environment you experienced the bug in.
cab0e83840f5afa0632aada56e6bacaf46211c33