Memory limitations - Githubissues

Hi there, I've been trying for a while to annotate contigs assembled from high-coverage fastq files using different DRAM versions (1.3.5 and 1.4.0.rc3), trusting the memory requirements specified on the DRAM wiki page:

If KOfam is used to annotate KEGG and UniRef90 is not used, then less than 50 GB of RAM is required. DRAM can be run with any number of processors on a single node.

These are the numbers of input sequences for the samples that I'm trying to annotate locally (my system has 64 GB of RAM), with an average length of around 600-700 bp. The files range from 70 MB to 647 MB in size:

Plot11_CoAs.fa: 935196 sequences
Plot1617_CoAs.fa: 729800 sequences
Plot28_CoAs.fa: 127948 sequences
Plot31_CoAs.fa: 840509 sequences
Plot3637_CoAs.fa: 830278 sequences
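
For anyone who wants to compare their own inputs against these numbers, here is a minimal Python sketch that counts sequences and computes the mean sequence length per FASTA file. It is not part of DRAM; the function name `fasta_stats` is just illustrative, and it assumes plain, uncompressed FASTA files matching `*CoAs.fa` in the current directory.

```python
from pathlib import Path


def fasta_stats(path):
    """Return (number of sequences, mean sequence length) for a FASTA file."""
    n_seqs = 0
    total_len = 0
    with open(path) as handle:
        for line in handle:
            line = line.strip()
            if not line:
                continue  # skip blank lines
            if line.startswith(">"):
                n_seqs += 1  # each header line starts a new sequence
            else:
                total_len += len(line)  # sequence may span several lines
    mean_len = total_len / n_seqs if n_seqs else 0.0
    return n_seqs, mean_len


if __name__ == "__main__":
    # Assumption: the input files sit in the current working directory.
    for fasta in sorted(Path(".").glob("*CoAs.fa")):
        n, mean_len = fasta_stats(fasta)
        print(f"{fasta.name}: {n} sequences, mean length {mean_len:.0f} bp")
```

The file is read line by line, so the script itself needs very little memory even for the largest inputs.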

DRAM.py annotate -i './*CoAs.fa' -o annotation --threads 8 --custom_fasta_loc /home/bioinformatica/Desktop/DRAM/DRAM_data/SCycDB_2020Mar_unique.fasta --custom_db_name SCycDB --custom_fasta_loc /home/bioinformatica/Desktop/DRAM/DRAM_data/NCyc_unique.fasta --custom_db_name NCyc --min_contig_size 900

2022-10-13 14:36:48,533 - Retrieved database locations and descriptions
2022-10-13 14:36:48,533 - Annotating Plot31_CoAs
2022-10-13 14:45:52,902 - Turning genes from prodigal to mmseqs2 db
2022-10-13 14:46:00,783 - Getting hits from kofam
2022-10-13 19:56:49,948 - Getting forward best hits from peptidase
2022-10-13 20:07:43,200 - Getting reverse best hits from peptidase
2022-10-13 20:08:04,572 - Getting descriptions of hits from peptidase
2022-10-13 20:08:05,673 - Getting hits from pfam
2022-10-13 20:13:20,861 - Getting hits from dbCAN
2022-10-13 20:18:55,260 - Getting hits from SCycDB
2022-10-13 20:18:55,260 - Getting forward best hits from SCycDB
2022-10-13 20:33:25,543 - Getting reverse best hits from SCycDB
2022-10-13 20:34:47,763 - Getting descriptions of hits from SCycDB
2022-10-13 20:34:49,929 - Getting hits from NCyc
2022-10-13 20:34:49,929 - Getting forward best hits from NCyc
2022-10-13 20:47:24,416 - Getting reverse best hits from NCyc
2022-10-13 20:48:06,088 - Getting descriptions of hits from NCyc
2022-10-13 20:48:12,689 - Merging ORF annotations

After getting all hits, DRAM seems to get stuck at the "Merging ORF annotations" step until the process is finally killed.
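
To see where the time goes before that point, the timestamps in the log above can be turned into per-step durations. The following is a rough Python sketch, not a DRAM feature: `step_durations` is a made-up helper name, and it assumes the `YYYY-MM-DD HH:MM:SS,mmm - message` format shown in the log. Only a few lines of the log are pasted in to keep the example short.

```python
from datetime import datetime

# Excerpt copied from the log above; paste the full log here to analyse every step.
LOG = """\
2022-10-13 14:45:52,902 - Turning genes from prodigal to mmseqs2 db
2022-10-13 14:46:00,783 - Getting hits from kofam
2022-10-13 19:56:49,948 - Getting forward best hits from peptidase
2022-10-13 20:07:43,200 - Getting reverse best hits from peptidase
"""


def step_durations(log_text):
    """Return [(message, elapsed)] for every log line that is followed by another."""
    entries = []
    for line in log_text.splitlines():
        if " - " not in line:
            continue  # not a "timestamp - message" line
        stamp, message = line.split(" - ", 1)
        try:
            when = datetime.strptime(stamp.strip(), "%Y-%m-%d %H:%M:%S,%f")
        except ValueError:
            continue  # ignore lines whose prefix is not a timestamp
        entries.append((when, message.strip()))
    # A step's duration is the gap until the next logged line.
    return [
        (message, following - start)
        for (start, message), (following, _) in zip(entries, entries[1:])
    ]


for message, elapsed in step_durations(LOG):
    print(f"{elapsed}  {message}")
```

On the full log this shows the kofam search taking a little over five hours (14:46 to 19:56). The last logged line never gets a duration because nothing is written after it, which is exactly the case for the merge step here.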

Are there any minimum memory recommendations for a dataset of this size?

WrightonLabCSU / DRAM

Memory limitations #226