epi2me-labs / wf-human-variation

ERROR When running MOD analysis only - modkit sample-probs #211

Closed AlbertoOsorio closed 1 day ago

AlbertoOsorio commented 2 weeks ago

Hello. I will proceed to detail the work done so far and the error encountered.

I am using the open data set giab_2023 to learn and test the workflow. I used the wf-basecalling workflow to obtain reads in BAM format (using the model: dna_r10.4.1_e8.2_400bps_hac@v5.0.0). Unfortunately, I forgot to include a Remora model for methylation calling; despite this, I ran wf-human-variation for the rest of the analysis using the BAM file produced initially, and everything went well up to this point.

I ran wf-basecalling a second time, but this time my configuration was as follows due to limited time for GPU usage:

Basecalling model: dna_r10.4.1_e8.2_400bps_fast@v5.0.0
remora_cfg: dna_r10.4.1_e8.2_400bps_hac@v5.0.0_5mC_5hmC@v1

The basecalling proceeded without issue, and I then ran wf-human-variation solely for mod analysis. I am attaching the configuration and the error encountered.

CONFIG:

Core Nextflow options
  revision : master
  runName : gigantic_allen
  containerEngine : docker
  container : [withLabel:wf_human_snp:ontresearch/wf-human-variation-snp:sha17e686336bf6305f9c90b36bc52ff9dd1fa73ee9,
    withLabel:wf_human_sv:ontresearch/wf-human-variation-sv:shac591518dd32ecc3936666c95ff08f6d7474e9728,
    withLabel:wf_human_mod:ontresearch/modkit:shae137452eb41f5f12b790774cafe15bc97f48d4d0,
    withLabel:wf_cnv:ontresearch/wf-cnv:sha428cb19e51370020ccf29ec2af4eead44c6a17c2,
    withLabel:wf_human_str:ontresearch/wf-human-variation-str:shadd2f2963fe39351d4e0d6fa3ca54e1064c6ec057,
    withLabel:snpeff_annotation:ontresearch/snpeff:shadcc812849019640d4d2939703fbb8777256e41ad,
    withLabel:wf_common:ontresearch/wf-common:sha8b5843d549bb210558cbb676fe537a153ce771d6,
    withLabel:spectre:ontresearch/spectre:sha49a9fe474da9860f84f08f17f137b47a010b1834,
    default:ontresearch/wf-human-variation:sha2b856c1f358ddf1576217a336bc0e9864b6dc0ed]
  launchDir : /home/copazo
  workDir : /home/copazo/work
  projectDir : /home/copazo/.nextflow/assets/epi2me-labs/wf-human-variation
  userName : copazo
  profile : standard
  configFiles : /home/copazo/.nextflow/assets/epi2me-labs/wf-human-variation/nextflow.config

Workflow Options
  mod : true

Main options
  sample_name : HG004_mod
  bam : bam/bam_methyl.bam
  ref : /home/copazo/ref/GCA_000001405.15_GRCh38_no_alt_analysis_set.fna
  bam_min_coverage : 10
  phased : true

Advanced Options
  override_basecaller_cfg: dna_r10.4.1_e8.2_400bps_hac@v5.0.0

ERROR:

Aug-13 14:00:08.948 [TaskFinalizer-3] ERROR nextflow.processor.TaskProcessor - Error executing process > 'sample_probs (1)'

Caused by: Process sample_probs (1) terminated with an error exit status (137)

Command executed:

probs=$( modkit sample-probs reads.bam -p 0.1 --interval-size 5000000 --only-mapped --threads 4 2> /dev/null | awk 'NR>1 {ORS=" "; print "--filter-threshold "$1":"$3}' )

Command exit status: 137

Command output: (empty)

Work dir: /home/copazo/work/f0/5c22d0eb25bdac0a4941ebcba6a862

Thank you in advance. Learning to use these tools has been really fun, but it seems I have hit a wall.

RenzoTale88 commented 2 weeks ago

Hi @AlbertoOsorio, exit status 137 usually means the process was killed for running out of memory, so it needs additional resources to complete. You can try increasing the memory for that single process by passing a custom configuration file. First, save the following text in a file named e.g. custom.config (any name works):

process {
  withName: sample_probs {
    memory 32.GB
  }
}

Then, you can pass it to the workflow with -c custom.config.
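
A minimal shell sketch of these two steps, assuming a POSIX shell in the launch directory. The `memory = 32.GB` line uses the assignment form that Nextflow configuration files expect. The run parameters in the commented command are copied from the configuration posted earlier in this thread and would need adjusting for other data.

```shell
# Write the override to custom.config (the file name is arbitrary).
cat > custom.config <<'EOF'
process {
    withName: sample_probs {
        memory = 32.GB
    }
}
EOF

# Show what was written.
cat custom.config

# Then pass the file with -c when launching the workflow, for example:
# nextflow run epi2me-labs/wf-human-variation \
#     -c custom.config \
#     --mod \
#     --bam bam/bam_methyl.bam \
#     --ref /home/copazo/ref/GCA_000001405.15_GRCh38_no_alt_analysis_set.fna \
#     --sample_name HG004_mod
```

The launch command is left commented out so the snippet can be pasted without starting a run; settings given with `-c` are applied on top of the workflow's own nextflow.config.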

AlbertoOsorio commented 2 weeks ago

Thanks, I'll try it!

What is the default memory for a single process? I am running the workflow on a really capable dedicated server, so it didn't cross my mind that memory might be the issue.

RenzoTale88 commented 2 weeks ago

The default memory for the process is 8.GB, so I'd suggest trying 16.GB or more if you have the memory available.
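
Some background on the 137 status, as a small shell sketch rather than anything specific to this workflow. Shells such as bash and dash report a process killed by signal N as exit status 128 + N, so 137 corresponds to signal 9 (SIGKILL), which is what the kernel or a container runtime sends when a memory limit is exceeded. The example only simulates the kill; it does not allocate memory.

```shell
# Simulate a task that is killed with SIGKILL, as an out-of-memory kill would do.
# The "|| status=$?" form captures the status without aborting under "set -e".
status=0
sh -c 'kill -9 $$' || status=$?

echo "exit status: $status"

# Subtract 128 to recover the signal number, then print its name.
signal=$((status - 128))
echo "signal number: $signal"
kill -l "$signal"
```

This is also why a large server can still hit the error: the limit that matters is typically the memory requested for the individual process (8 GB by default here), not the total installed on the machine.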

kyleb512 commented 3 days ago

I was having the same issue and ended up here. @RenzoTale88's solution works but is missing an =. It should be as follows:

process {
  withName: sample_probs {
    memory = 32.GB
  }
}

RenzoTale88 commented 1 day ago

Hi @AlbertoOsorio, apologies for the delay. We just released a new version of the workflow that uses modkit v0.3.3. This version fixes a bug that caused high memory usage by the sample-probs stage, and should likely address the issue you encountered. Can you please update the workflow and let us know if you still have memory issues with the processes?

Andrea

AlbertoOsorio commented 1 day ago

Hi @RenzoTale88, I forgot to update you about the custom config solution: it worked great. An out-of-memory error appeared again later for a different process; I simply added that one to the config file and the workflow finished with no trouble.
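
When another process fails the same way, the name to use in a `withName:` selector can be read from the error line. A hedged shell sketch, using the log line from the original report as sample input; the real log format may vary between Nextflow versions, and the 32.GB value is only a placeholder carried over from the earlier suggestion.

```shell
# Sample log line copied from the error report in this thread.
log_line="Aug-13 14:00:08.948 [TaskFinalizer-3] ERROR nextflow.processor.TaskProcessor - Error executing process > 'sample_probs (1)'"

# Keep only the process name: the text after the opening quote, up to the first space.
failed_process=$(printf '%s\n' "$log_line" |
  sed -n "s/.*Error executing process > '\([A-Za-z0-9_:]*\).*/\1/p")

echo "failed process: $failed_process"

# Print a selector block that could be added inside process { ... } in the custom config.
printf 'withName: %s {\n    memory = 32.GB\n}\n' "$failed_process"
```

In practice the same `sed` expression could be run over the .nextflow.log file in the launch directory instead of a hard-coded line.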

Regarding the new release: I could try again during the week with the updated workflow and no config file, to see whether that solves it.

RenzoTale88 commented 1 day ago

@AlbertoOsorio thanks, if the previous fix worked then great! We can close this one; if you come across more issues, please open a new ticket. All the best, Andrea