Open njbernstein opened 3 years ago
Hi,
Sorry, we don't have a config ready for this. It may help to read the Nextflow documentation here: https://www.nextflow.io/docs/latest/google.html
The main parameter for controlling memory is `nsplit`: the genome (or the target region, if you provide a BED file) is split into `nsplit` chunks. Each chunk runs as a separate job, in which the reads are processed by samtools and converted into a text file that R then loads entirely into memory. The more you increase `nsplit`, the smaller this file will be.
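To make the effect of `nsplit` concrete, here is a minimal Python sketch of the chunking idea described above. It is not needlestack's actual code: the function names `split_region` and `largest_chunk` are hypothetical, and it assumes the region is cut into near-equal contiguous pieces, which may differ from how the pipeline really splits.

```python
def split_region(start, end, nsplit):
    """Split the half-open interval [start, end) into nsplit near-equal chunks.

    Hypothetical helper for illustration only; needlestack's own splitting
    logic may differ.
    """
    if nsplit < 1:
        raise ValueError("nsplit must be at least 1")
    total = end - start
    # Every chunk gets `base` positions; the first `extra` chunks get one more.
    base, extra = divmod(total, nsplit)
    chunks = []
    pos = start
    for i in range(nsplit):
        size = base + (1 if i < extra else 0)
        chunks.append((pos, pos + size))
        pos += size
    return chunks


def largest_chunk(start, end, nsplit):
    """Return the number of positions in the biggest chunk."""
    return max(stop - begin for begin, stop in split_region(start, end, nsplit))


if __name__ == "__main__":
    region_length = 1_000_000  # illustrative region size, not a real target
    for nsplit in (1, 10, 100):
        print(nsplit, largest_chunk(0, region_length, nsplit))
```

Under this assumption the largest chunk shrinks in proportion to `nsplit`, so the text file that each job loads into R should shrink accordingly. The actual memory per job also depends on the number of samples and on coverage, which this sketch does not model.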
Hi there,
Do you have any advice about running needlestack on a large number of samples on Google Cloud?
Any chance you have a config already for it?
Do all reads get loaded into memory at the same time?
Do you have a ballpark estimate of how much RAM would be necessary for 1,000 samples, or even 10,000 samples?