Faizal-Eeman opened this issue 6 months ago (status: Open)
@Faizal-Eeman, congrats on getting the 2.5 TB / 1.1 TB samples through the pipeline! To help future users with large input files, I've summarized the maximum resources actually used by each process:
Maximum values:

| Tool | realtime | %cpu* | peak_rss |
|---|---|---|---|
| run_validate_PipeVal | 16h 24m | 64% | 20 MB |
| call_sSNV_SomaticSniper | 5d 5h 4m | 85% | 18 GB |
| convert_BAM2Pileup_SAMtools | 9d 1h 46m | 90% | 25 GB |
| call_sIndel_Manta | 4d 4h 24m | 239% | 5 GB |
| call_sSNV_Strelka2 | 23h 33m | 2530% | 20 GB |
| call_sSNV_Mutect2 | 1d 7h 32m | 125% | 23 GB |
| run_LearnReadOrientationModel_GATK | 22m | 103% | 28 GB |
| call_sSNV_MuSE | 1d 9h 27m | 1589% | 119 GB |
| run_sump_MuSE | 3m | 172% | 3 GB |
*don't trust the %cpu numbers
Maximum values for Case 2, tumors 1 and 2:

| Tumor | Tool | realtime | %cpu* | peak_rss |
|---|---|---|---|---|
| tumor1 | call_sSNV_SomaticSniper | 13h 30m | 93% | 3.2 GB |
| tumor2 | call_sSNV_SomaticSniper | 12h 47m | 96% | 2.6 GB |
| tumor1 | convert_BAM2Pileup_SAMtools | 12h 4m | 95% | 13.7 GB |
| tumor2 | convert_BAM2Pileup_SAMtools | 11h 23m | 96% | 13.5 GB |
| tumor1 | call_sIndel_Manta | 1d 12h 52m | 79% | < 1 GB |
| tumor2 | call_sIndel_Manta | 1d 11h 34m | 79% | < 1 GB |
| tumor1 | call_sSNV_Strelka2 | 7h 3m | 703% | 3.3 GB |
| tumor2 | call_sSNV_Strelka2 | 6h 27m | 750% | 2.8 GB |
| tumor1 | call_sSNV_Mutect2 | 6h 40m | 100% | 2.8 GB |
| tumor2 | call_sSNV_Mutect2 | 6h 13m | 100% | 2.5 GB |
| tumor1 | run_LearnReadOrientationModel_GATK | 6m | 98% | 2.7 GB |
| tumor2 | run_LearnReadOrientationModel_GATK | 8m | 98% | 3.2 GB |
| tumor1 | call_sSNV_MuSE | 5h 22m | 1196% | 62.5 GB |
| tumor2 | call_sSNV_MuSE | 6h 10m | 1198% | 67.5 GB |
| tumor1 | run_sump_MuSE | < 1m | 1427% | 9.1 GB |
| tumor2 | run_sump_MuSE | 6m | 1126% | 32.4 GB |
*don't trust the %cpu numbers
Here are the failed logs for the 2 TB sample that led me to the CPU/memory updates in the description. These failed runs used the default memory allocations; once I identified exit code 137 (out of memory) for a process, I increased that process's allocation accordingly, and I also increased the allocations of processes where I anticipated the same 137 error (see the sketch after the log paths below).
/hot/data/unregistered/Zook-Mootor-BNCH-GIAB/analysis/GIAB/AshkenazimParents/somatic-variants/test/failed/call-sSNV-7.0.0/HG002-T/log-call-sSNV-7.0.0-20231102T215524Z
/hot/data/unregistered/Zook-Mootor-BNCH-GIAB/analysis/GIAB/AshkenazimParents/somatic-variants/test/failed/call-sSNV-7.0.0/HG002-T/log-call-sSNV-7.0.0-20231112T041924Z
/hot/data/unregistered/Zook-Mootor-BNCH-GIAB/analysis/GIAB/AshkenazimParents/somatic-variants/test/failed/call-sSNV-7.0.0/HG002-T/log-call-sSNV-7.0.0-20231114T003236Z
/hot/data/unregistered/Zook-Mootor-BNCH-GIAB/analysis/GIAB/AshkenazimParents/somatic-variants/test/failed/call-sSNV-7.0.0/HG002-T/log-call-sSNV-7.0.0-20231115T032828Z
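For future runs, one generic way to absorb this kind of failure without re-launching by hand is Nextflow's built-in retry mechanism with a scaled allocation. This is a minimal sketch, not the configuration used for these runs; the process name and starting memory are only examples.

```groovy
// Minimal sketch (not the config used here): retry a task that was killed with
// exit code 137 (out of memory), granting it more memory on each attempt.
process {
    withName: 'call_sSNV_Mutect2' {               // example process name
        errorStrategy = { task.exitStatus == 137 ? 'retry' : 'terminate' }
        maxRetries    = 2
        memory        = { 32.GB * task.attempt }  // 32 GB, then 64 GB, then 96 GB
    }
}
```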
@sorelfitzgibbon @Faizal-Eeman do you guys think it's worth updating the M64 config based on these results? https://github.com/uclahs-cds/pipeline-call-sSNV/blob/main/config/M64.config
Yes, it looks like several values can be substantially lowered. I'll work on this.
@Faizal-Eeman it looks like these files have moved; are they still easily accessible? I'd like to check a couple of little things, but it's not urgent.
Yes. I've updated the file paths here now.
It looks like our configs are missing a few processes in SomaticSniper (e.g. `generate_ReadCount_bam_readcount`) @sorelfitzgibbon
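For illustration only (placeholder values, and the shipped configs may organize their allocations differently), an added entry for one of the missing processes could look like:

```groovy
// Hypothetical addition for a process not yet covered by the configs;
// the values below are placeholders, not measured or recommended allocations.
process {
    withName: 'generate_ReadCount_bam_readcount' {
        cpus   = 1
        memory = 4.GB
    }
}
```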
When running very large sample BAMs through call-sSNV, it is likely that the pipeline will fail with the default resource configurations. Although the `base_resource_update` function in `template.config` is a great utility for updating resources on a case-by-case basis, it is often unclear how much each resource needs to be increased for a successful run. It would be nice to provide examples of resource configurations that worked for large BAMs, perhaps in a `doc/` dir of the repo. Here are the resources I set in my pipeline run's config:
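For illustration, per-process overrides in a plain Nextflow config can take roughly the shape below; every number is a placeholder rather than the exact values from my run, and only the process names are taken from call-sSNV.

```groovy
// Illustrative shape of a large-BAM resource override config.
// All values are placeholders; adjust to observed peak usage plus headroom.
process {
    withName: 'convert_BAM2Pileup_SAMtools' {
        memory = 30.GB
    }
    withName: 'call_sSNV_MuSE' {
        cpus   = 16
        memory = 130.GB
    }
    withName: 'call_sSNV_SomaticSniper' {
        memory = 24.GB
    }
}
```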
Nextflow trace files:

Case 1:
Normal - 2.5 TB, Tumor - 1.1 TB
/hot/data/unregistered/Zook-Mootor-BNCH-GIAB/analysis/GIAB/AshkenazimParents/somatic-variants/call-sSNV-7.0.0/HG002-T/log-call-sSNV-7.0.0-20231201T000959Z/nextflow-log/trace.txt
Case 2:
Normal - 369GB