marbl / canu

A single molecule sequence assembler for genomes large and small.
http://canu.readthedocs.io/

How much storage space is required for the assembly of a large genome species? #2302

Closed Jung19911124 closed 2 months ago

Jung19911124 commented 3 months ago

My group is planning whole-genome sequencing (WGS) of a wild animal with a large genome (approximately 6.5 Gb). After the HiFi assembly, we plan to close gaps using TELL-Seq reads and CLRs. Due to budgetary constraints, HiFi coverage will be limited to about 20x.

Under these conditions, how much storage space should be reserved on the server when haplotype-aware assembly is performed using HiCanu?

Best, Jung

skoren commented 2 months ago

There's some information on the required resources in the FAQ: https://canu.readthedocs.io/en/latest/faq.html#what-resources-does-canu-require-for-a-bacterial-genome-assembly-a-mammalian-assembly

Usually a human genome requires approximately 200 GB of disk space with HiFi data, so given that your genome is about 2x larger, I'd estimate 500 GB to 1 TB.
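
For anyone wanting to redo this arithmetic for a different genome, here is a minimal sketch in Python. It is not part of Canu. The function names are made up for illustration, the human genome size of about 3.1 Gbp is an assumption (the reply only says "about 2x larger"), and linear scaling of disk usage with genome size is also an assumption. The ~200 figure is the one quoted in the reply, read as gigabytes of disk.

```python
# Back-of-the-envelope numbers only; not a Canu tool or API.
HUMAN_GENOME_GBP = 3.1      # assumption: approximate human genome size, gigabases
HUMAN_HIFI_DISK_GB = 200.0  # from the reply: ~200 for a human HiFi assembly, read as GB


def total_bases_gbp(genome_gbp: float, coverage: float) -> float:
    """Total sequenced bases (gigabases) at a given coverage depth."""
    return genome_gbp * coverage


def scaled_disk_gb(genome_gbp: float,
                   ref_genome_gbp: float = HUMAN_GENOME_GBP,
                   ref_disk_gb: float = HUMAN_HIFI_DISK_GB) -> float:
    """Scale the reference disk figure linearly with genome size (an assumption)."""
    return ref_disk_gb * genome_gbp / ref_genome_gbp


if __name__ == "__main__":
    genome = 6.5  # Gbp, from the question
    print(f"HiFi input at 20x: {total_bases_gbp(genome, 20):.0f} Gbp")    # 130 Gbp
    print(f"linearly scaled disk estimate: {scaled_disk_gb(genome):.0f} GB")  # 419 GB
```

The linear figure comes out at roughly 420 GB, below the 500 GB to 1 TB range suggested in the reply, so that range leaves headroom above a straight proportional estimate.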