Closed mihinduk closed 5 months ago
Hi Kathie, unfortunately flye --subassemblies is a bit slow and memory-hungry, and I haven't found a good alternative yet :/
What is the maximum memory available on your nodes? I remember you have 24 CPUs per node, but your config is capping the memory at 100 GB for the job. We should also look into adding custom scheduler commands in the config so you can pass --exclusive for those jobs.
I would also suggest trying --assembly cross, as megahit is pretty good with memory management.
Hi Mike, I have a dataset of 288 paired fastqs (of which 4 are mock controls). For the non-mock controls, the number of input reads per fastq file ranges from 265,771 to 197,238,830. I am having trouble getting this through the population assembly step due to memory issues. I am not sure whether this is due to the number of input contigs (637,144) or the large size of some contigs (3 contigs are larger than 1,000,000 bp: 1,029,230, 1,147,109 and 1,150,404 bp).
I have tried increasing both memory and runtime, and wonder if you can give me further advice. I have attached my current config.yaml files (saved as text files).
Here is the flye log: more /scratch/sahlab/kathie/AHandley/2023_10_05_hecatomb_out/stderr/population_assembly.flye.log
hecatomb.config.yaml.txt config.yaml.txt
Thank you for your help, Kathie
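For anyone trying to tell whether the contig count or a few very large contigs is behind the memory use, here is a minimal sketch that tallies contig lengths from FASTA text. It is not part of Hecatomb or flye; the function names (contig_lengths, summarize) and the 1,000,000 bp threshold are assumptions chosen to match the numbers quoted above.

```python
from typing import Dict, Iterable, Iterator, List, Tuple


def contig_lengths(lines: Iterable[str]) -> Iterator[Tuple[str, int]]:
    """Yield (contig_name, length_in_bp) for each record in FASTA-formatted lines."""
    name = None
    length = 0
    for raw in lines:
        line = raw.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            if name is not None:
                yield name, length
            parts = line[1:].split()
            name = parts[0] if parts else ""  # first token of the header
            length = 0
        elif name is not None:
            length += len(line)  # sequence may be wrapped over several lines
    if name is not None:
        yield name, length


def summarize(lines: Iterable[str], threshold: int = 1_000_000) -> Dict[str, object]:
    """Count contigs, total bases, and contigs longer than `threshold` bp."""
    records: List[Tuple[str, int]] = list(contig_lengths(lines))
    large = sorted((r for r in records if r[1] > threshold), key=lambda r: r[1])
    return {
        "n_contigs": len(records),
        "total_bp": sum(length for _, length in records),
        "large": large,
    }
```

To use it on a real file, pass an open file handle, e.g. `with open(path) as fh: print(summarize(fh))`, where `path` is wherever the input contigs for the population assembly live on your system.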