Open kapsakcj opened 1 year ago
For this particular failure, we are downsampling the FASTQs with RASUSA first, but it doesn't hurt to fix these potential issues anyways
Large FASTQs are now supported in fastq-scan (https://github.com/rpetit3/fastq-scan/releases/tag/v1.0.1), but I agree with your approach of subsampling to a reasonable coverage.
2GB of RAM ain't enough when your FASTQ files are >11GB in size, like from a NovaSeq.
This line:
https://github.com/theiagen/public_health_viral_genomics/blob/main/tasks/quality_control/task_fastq_scan.wdl#L50
and this line:
https://github.com/theiagen/public_health_viral_genomics/blob/bd7f8a9936ccb3548d2e1d88302b2e0e4b7b8032/tasks/quality_control/task_fastq_scan.wdl#L87
should be upped to at least 8 GB.
That said, when I ran the 11GB FASTQ file through the WDL on the command line, it consumed upwards of 18GB of RAM, so even 8 GB may not be enough on the first attempt. If Terra kicks in the "memory retry" feature, these files should still get processed fine on the 2nd or 3rd attempt.
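For reference, here is a rough sketch of what the change could look like. This is not the actual contents of task_fastq_scan.wdl: the task name, the `memory_gb` and `docker` inputs, the output filename, and the assumption that the reads are gzipped are all hypothetical. `maxRetries` is Cromwell's runtime attribute for retrying a failed task; whether a retry also gets more memory depends on the memory retry setting in Terra.

```wdl
version 1.0

# Hypothetical sketch, not the real task from this repo.
task fastq_scan_sketch {
  input {
    File read1
    String docker       # hypothetical: container image that provides fastq-scan
    Int memory_gb = 8   # hypothetical input; raised from the current 2 GB
  }
  command <<<
    # fastq-scan reads FASTQ from stdin; assumes read1 is gzipped
    zcat ~{read1} | fastq-scan > fastq_scan.json
  >>>
  output {
    File report = "fastq_scan.json"
  }
  runtime {
    docker: docker
    memory: "~{memory_gb} GB"
    # Lets Cromwell retry a failed attempt instead of failing the workflow outright
    maxRetries: 3
  }
}
```

Exposing memory as an input means anyone with NovaSeq-sized FASTQs can bump it per run without another code change, while the 8 GB default covers the typical case.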