Open szimmerman92 opened 2 months ago
I get the same error when searching against the NCBI NT database with MMseqs2. I am running on our internal server with 256 GB of memory.
I've encountered segfaults with mmseqs caused by insufficient memory (which is a valid reason for a segfault, according to a quick web search). Large databases like NT/GTDB might need around 900 GB of RAM, so I would guess that too little RAM is the cause in your cases as well.
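If memory really is the limiting factor, one thing worth trying is capping the memory used per split, so that the target database is processed in chunks instead of all at once. Below is a minimal sketch, assuming a Linux host with /proc/meminfo. The helper name `mem_limit_gb` and the 80% headroom are my own illustrative choices, not a recommendation from the MMseqs2 authors; `--split-memory-limit` is an existing MMseqs2 option, and the file names are taken from the command in the original report.

```shell
#!/usr/bin/env bash
# Hypothetical helper: convert an available-memory value (in kB) into a
# whole-GB budget, leaving some headroom. The 80% default is an assumption.
mem_limit_gb() {
    local avail_kb=${1:-0}
    local percent=${2:-80}
    echo $(( avail_kb * percent / 100 / 1024 / 1024 ))
}

# Read MemAvailable (kB) from /proc/meminfo; Linux only.
avail_kb=0
if [ -r /proc/meminfo ]; then
    avail_kb=$(awk '/^MemAvailable:/ {print $2}' /proc/meminfo)
fi
limit="$(mem_limit_gb "$avail_kb")G"

# Print the command instead of running it, so it can be inspected first.
echo "mmseqs easy-search all_nuc.fasta seqTaxDB tax_assignments.txt tmp" \
     "--search-type 3 --min-seq-id 0.65 -e 0.01 -c 0.8 --cov-mode 2" \
     "--threads 16 --split-memory-limit $limit"
```

Whether this avoids the crash depends on where the memory is actually going, so treat it as something to test rather than a confirmed fix.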
Expected Behavior
easy-search should finish execution without errors
Current Behavior
Error during pre-filter step
Steps to Reproduce (for bugs)
First create a custom nucleotide database
mmseqs createdb --dbtype 2 --compressed 1 refseq_bacteria_archaea_fungi_viral.fna.gz seqTaxDB
mmseqs createtaxdb seqTaxDB tmp --ncbi-tax-dump ncbi-taxdump --tax-mapping-file fastaid_taxid.tsv
Next run easy-search
mmseqs easy-search all_nuc.fasta seqTaxDB tax_assignments.txt tmp --search-type 3 --min-seq-id 0.65 -e 0.01 -c 0.8 --cov-mode 2 --threads 16
MMseqs Output (for bugs)
Below is the output of easy-search
Context
Hi, I am trying to run a nucleotide-nucleotide search in MMseqs2 with a custom database. This error does not occur with a different, smaller nucleotide database.
Thank you very much for this amazing tool and all your hard work.
Your Environment
I am using a Google Cloud VM with 64 CPUs and 416 GB of memory, running Ubuntu 20.04.
I installed MMseqs2 with the command:
# static build with AVX2 (fastest)
wget https://mmseqs.com/latest/mmseqs-linux-avx2.tar.gz; tar xvfz mmseqs-linux-avx2.tar.gz; export PATH=$(pwd)/mmseqs/bin/:$PATH
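Since the AVX2 static build only runs on CPUs that support AVX2, a quick check of the CPU flags can rule that out as a factor. This is a small sketch assuming a Linux host; `has_cpu_flag` is a hypothetical helper name, and its optional second argument exists only so that the check can be pointed at a file other than /proc/cpuinfo.

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeed if the given CPU flag appears as a whole word
# in the cpuinfo file (default: /proc/cpuinfo, Linux only).
has_cpu_flag() {
    local flag=$1
    local cpuinfo=${2:-/proc/cpuinfo}
    grep -qw -- "$flag" "$cpuinfo" 2>/dev/null
}

if has_cpu_flag avx2; then
    echo "AVX2 is reported by this CPU"
else
    echo "AVX2 not reported; the SSE4.1 build may be the safer choice"
fi
```

An unsupported instruction set would normally show up as an illegal-instruction error rather than a crash in the prefilter, so this is only a sanity check on the environment.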