Error: Convert Alignments died #329

Open pisle0 opened 4 weeks ago

pisle0 commented 4 weeks ago

I am trying to run a foldseek easy-search job against the UniProt database built with foldseek databases Alphafold/UniProt. In two attempts on different machines (with up to 256GB memory), the job completes prefilter and structurealign, but immediately after structurealign it throws the following error while trying to allocate 415722228266 bytes of memory, and quits with Error: Convert Alignments died:

Can not touch 415722228266 into main memory
[=================================================================] 2.34K 2m 12s 776ms
Time for merging to strualn: 0h 0m 0s 9ms

I want to ask whether this is purely a memory availability issue, and whether there is a way to apply something like --split-memory-limit here, as in the prefilter step. Thank you.
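
For scale, the size in the message can be compared with the largest machine mentioned above. This is only an arithmetic sketch; it assumes "256GB" means 256 GiB and says nothing about how foldseek actually uses the memory.

```python
# Size reported in the "Can not touch ... into main memory" message, in bytes.
requested_bytes = 415_722_228_266

# Largest machine mentioned in this report; assumption: "256GB" means 256 GiB.
available_bytes = 256 * 1024**3


def to_gib(n_bytes: int) -> float:
    """Convert a byte count to GiB (2**30 bytes)."""
    return n_bytes / 1024**3


print(f"requested: {to_gib(requested_bytes):.1f} GiB")
print(f"available: {to_gib(available_bytes):.1f} GiB")
print(f"requested / available: {requested_bytes / available_bytes:.2f}")
```

The requested size comes out at roughly 387 GiB, which is about one and a half times the RAM of the larger machine.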

milot-mirdita commented 4 weeks ago

Could you please post the whole log?

pisle0 commented 4 weeks ago

Sure, here is the whole log:

Create directory ./results/foldseek_0/tmp
easy-search ./db/db_0 ../UniProt ./results/foldseek_0/foldseek_0.m8 ./results/foldseek_0/tmp --format-mode 4 --format-output query,target,evalue,gapopen,pident,fident,nident,qstart,qend,qlen,tstart,tend,tlen,alnlen,bits,mismatch,qcov,tcov,qset,qsetid,tset,tsetid,lddt,qtmscore,ttmscore,alntmscore,prob -e 1e-5 --threads 95 

structurealign ./db/db_0 ../UniProt ./results/foldseek_0/tmp/6098992401104940622/search_tmp/9988097761208953728/pref ./results/foldseek_0/tmp/6098992401104940622/search_tmp/9988097761208953728/strualn --tmscore-threshold 0 --lddt-threshold 0 --sort-by-structure-bits 1 --alignment-type 2 --exact-tmscore 0 --sub-mat 'aa:3di.out,nucl:3di.out' -a 1 --alignment-mode 3 --alignment-output-mode 0 --wrapped-scoring 0 -e 1e-05 --min-seq-id 0 --min-aln-len 0 --seq-id-mode 0 --alt-ali 0 -c 0 --cov-mode 0 --max-seq-len 65535 --comp-bias-corr 1 --comp-bias-corr-scale 0.5 --max-rejected 2147483647 --max-accept 2147483647 --add-self-matches 0 --db-load-mode 0 --pca substitution:1.100,context:1.400 --pcb substitution:4.100,context:5.800 --score-bias 0 --realign 0 --realign-score-bias -0.2 --realign-max-seqs 2147483647 --corr-score-weight 0 --gap-open aa:10,nucl:10 --gap-extend aa:1,nucl:1 --zdrop 40 --threads 95 --compressed 0 -v 3 

Can not touch 415722228266 into main memory
[=================================================================] 2.34K 2m 12s 776ms
Time for merging to strualn: 0h 0m 0s 9ms
Time for processing: 0h 7m 4s 177ms
mvdb ./results/foldseek_0/tmp/6098992401104940622/search_tmp/9988097761208953728/strualn ./results/foldseek_0/tmp/6098992401104940622/search_tmp/9988097761208953728/aln 

Time for processing: 0h 0m 0s 1ms
mvdb ./results/foldseek_0/tmp/6098992401104940622/search_tmp/9988097761208953728/aln ./results/foldseek_0/tmp/6098992401104940622/result -v 3 

Time for processing: 0h 0m 0s 3ms
Removing temporary files
rmdb ./results/foldseek_0/tmp/6098992401104940622/search_tmp/9988097761208953728/pref -v 3 

Time for processing: 0h 0m 0s 4ms
Error: Convert Alignments died

milot-mirdita commented 4 weeks ago

Does it crash without --format-mode 4?

pisle0 commented 4 weeks ago

Hi, I tried the following:

  1. without --format-mode 4 but with --format-output kept, same error: Error: Convert Alignments died
  2. without both --format-mode 4 and --format-output, convertalis runs but the Can not touch 415722228266 into main memory still persists.
  3. with --format-mode 4 but without --format-output, same as 2

It looks like one of the extra columns requested in the output must be causing the crash. Do you have a rough idea which one(s) it might be?
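
One way to narrow this down would be to request each column on its own and see which run fails. The sketch below only prints candidate foldseek convertalis commands, one per field from the --format-output list above; it does not run anything. The alignment database path and the output prefix are hypothetical, and it assumes the alignment database was kept from a separate foldseek search run (easy-search removes its temporary files).

```python
import shlex

# Fields copied from the --format-output option in the log above.
FIELDS = (
    "query,target,evalue,gapopen,pident,fident,nident,qstart,qend,qlen,"
    "tstart,tend,tlen,alnlen,bits,mismatch,qcov,tcov,qset,qsetid,tset,"
    "tsetid,lddt,qtmscore,ttmscore,alntmscore,prob"
).split(",")


def convertalis_commands(query_db, target_db, aln_db, out_prefix, fields=FIELDS):
    """Build one convertalis command per output field (nothing is executed)."""
    commands = []
    for field in fields:
        commands.append([
            "foldseek", "convertalis",
            query_db, target_db, aln_db,
            f"{out_prefix}.{field}.m8",
            "--format-output", field,
        ])
    return commands


if __name__ == "__main__":
    # "./results/aln" and "./results/probe" are placeholder paths.
    for cmd in convertalis_commands("./db/db_0", "../UniProt",
                                    "./results/aln", "./results/probe"):
        print(shlex.join(cmd))
```

Running the printed commands one at a time should show whether a single field is enough to reproduce Error: Convert Alignments died.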

pisle0 commented 3 weeks ago

Hi, I want to follow up to see if you have any insight into this. Could this be an incompatibility with a database built using a previous version of foldseek? Thank you in advance.

milot-mirdita commented 3 weeks ago

I currently have very limited time to look into this. My best guess right now is that one of these fields is causing a crash for some reason. I would guess that it might be one of the set ones, since these are less well tested:

qset,qsetid,tset,tsetid

The databases should remain compatible between versions, so I don't think this is the issue.
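
A quick way to test that guess would be to rerun with the set columns removed from --format-output. The sketch below builds the reduced field list; it assumes that "the set ones" means qset, qsetid, tset and tsetid from the original command, which is an interpretation rather than something confirmed above.

```python
# Full field list from the original easy-search command.
ORIGINAL_FIELDS = (
    "query,target,evalue,gapopen,pident,fident,nident,qstart,qend,qlen,"
    "tstart,tend,tlen,alnlen,bits,mismatch,qcov,tcov,qset,qsetid,tset,"
    "tsetid,lddt,qtmscore,ttmscore,alntmscore,prob"
)

# Assumption: these are the "set" fields suspected of causing the crash.
SUSPECT_FIELDS = {"qset", "qsetid", "tset", "tsetid"}


def without_suspects(format_output: str, suspects=SUSPECT_FIELDS) -> str:
    """Return the comma-separated field list with the suspect fields dropped."""
    kept = [f for f in format_output.split(",") if f not in suspects]
    return ",".join(kept)


reduced = without_suspects(ORIGINAL_FIELDS)
print(f"--format-output {reduced}")
```

If the run completes with the reduced list, adding the four fields back one at a time should identify which of them triggers the crash.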