epi2me-labs / wf-human-variation


Error executing process > 'mod:modkit_phase (1)' #184

Closed cihaterdogan closed 3 days ago

cihaterdogan commented 6 months ago

Operating System

Other Linux (please specify below)

Other Linux

Red Hat Enterprise Linux 8.9

Workflow Version

v2.2.0

Workflow Execution

Command line (Cluster)

Other workflow execution

No response

EPI2ME Version

No response

CLI command run

module load singularity/3.7.2
module load java/15.0.2

export NXF_SINGULARITY_CACHEDIR=Path/.singularity
export SINGULARITY_TMPDIR=Path/.singularity
export NXF_HOME=Path

OUTPUT=s4059246577_results

./nextflow run epi2me-labs/wf-human-variation \
    -w ${OUTPUT}/workspace \
    -profile singularity \
    --bam 'data/s4059246577_merged.bam' \
    --basecaller_cfg 'dna_r10.4.1_e8.2_400bps_sup@v4.3.0' \
    --mod \
    --ref 'hg38/GCA_000001405.15_GRCh38_no_alt_analysis_set.fna' \
    --sample_name 's4059246577' \
    --sv \
    --snp \
    --str \
    --cnv \
    --phased \
    --threads 16 \
    --bam_min_coverage 5 \
    --out_dir ${OUTPUT}

Workflow Execution - CLI Execution Profile

None

What happened?

I was running the wf-human-variation pipeline as a Slurm job on my institution's cluster using the above command, but got the following error:

ERROR ~ Error executing process > 'mod:modkit_phase (1)'

Caused by:
  Process `mod:modkit_phase (1)` terminated with an error exit status (2)

Command executed:

  modkit pileup \
      chrY_hp.bam \
      s4059249500 \
      --ref GCA_000001405.15_GRCh38_no_alt_analysis_set.fna \
      --partition-tag HP \
      --interval-size 1000000 \
      --prefix s4059249500.wf_mods.chrY \
      --log-filepath modkit.log \
      --region chrY \
      --filter-threshold 0.78125 0.8613281 \
      --threads 4 --combine-strands --cpg

  # Compress all
  for i in `ls s4059249500/`; do
      root_name=$( basename $i '.bed' )
      # modkit saves the file as params.sample_name.wf_mods_haplotype.bed
      # create a new name with the pattern params.sample_name.wf_mods.haplotype.bedmethyl
      new_name=$( echo ${root_name} | sed 's/wf_mods_/wf_mods\./' )
      mv s4059249500/${root_name}.bed s4059249500/${new_name}.bedmethyl
      bgzip s4059249500/${new_name}.bedmethyl
  done

Command exit status:
  2

Command output:
  (empty)

Command error:
  error: unexpected argument '0.8613281' found

  Usage: modkit pileup [OPTIONS] <IN_BAM> <OUT_BED>

  For more information, try '--help'.

Relevant log output

executor >  local (1297)
[3d/dfdce2] process > index_ref_fai (1)              [100%] 1 of 1 ✔
[44/668649] process > ingress:checkBamHeaders (1)    [100%] 1 of 1 ✔
[-        ] process > ingress:sortBam                -
[-        ] process > ingress:mergeBams              -
[-        ] process > ingress:catSortBams            -
[-        ] process > ingress:validateIndex          -
[08/1493a4] process > ingress:samtools_index (1)     [100%] 1 of 1 ✔
[50/f5b016] process > ingress:check_for_alignment... [100%] 1 of 1 ✔
[6a/f368e9] process > ingress:minimap2_alignment (1) [100%] 1 of 1 ✔
[0e/2a4326] process > getGenome (1)                  [100%] 1 of 1 ✔
[05/8f26cb] process > cram_cache (1)                 [100%] 1 of 1 ✔
[5f/8bfbd4] process > getAllChromosomesBed (1)       [100%] 1 of 1 ✔
[a3/1f727a] process > mosdepth_input (1)             [100%] 1 of 1 ✔
[27/18c41e] process > getVersions                    [100%] 1 of 1 ✔
[1a/f64212] process > getParams                      [100%] 1 of 1 ✔
[e0/83cba9] process > readStats (1)                  [100%] 1 of 1 ✔
[3c/66bfbc] process > infer_sex (1)                  [100%] 1 of 1 ✔
[fb/1bb250] process > makeAlignmentReport            [100%] 1 of 1 ✔
[-        ] process > failedQCReport                 -
[34/c1685c] process > lookup_clair3_model (1)        [100%] 1 of 1 ✔
[dd/026ccb] process > snp:make_chunks (1)            [100%] 1 of 1 ✔
[6f/037a89] process > snp:pileup_variants (622)      [100%] 632 of 632 ✔
[40/d9f4cd] process > snp:aggregate_pileup_varian... [100%] 1 of 1 ✔
[73/5db1df] process > snp:select_het_snps (10)       [100%] 25 of 25 ✔
[93/510285] process > snp:phase_contig (25)          [100%] 25 of 25 ✔
[1b/27cfdd] process > snp:get_qual_filter (1)        [100%] 1 of 1 ✔
[c1/f06fc8] process > snp:create_candidates (13)     [100%] 25 of 25 ✔
[f2/af49e1] process > snp:evaluate_candidates (465)  [100%] 472 of 472 ✔
[e4/4883d7] process > snp:aggregate_full_align_va... [100%] 1 of 1 ✔
[ed/16e203] process > snp:merge_pileup_and_full_v... [100%] 25 of 25 ✔
[59/d38a0d] process > snp:post_clair_phase_contig... [ 96%] 24 of 25
[07/e52bfe] process > snp:post_clair_contig_haplo... [ 54%] 13 of 24
[-        ] process > snp:cat_haplotagged_contigs    -
[-        ] process > snp:aggregate_all_variants     -
[-        ] process > snp:haploblocks_snp            -
[-        ] process > sv:variantCall:sniffles2       -
[-        ] process > sv:variantCall:filterCalls     -
[-        ] process > sv:variantCall:sortVCF         -
[-        ] process > sv:annotate_sv_vcf             -
[4e/7ef5bc] process > sv:runReport:getVersions       [100%] 1 of 1 ✔
[7e/5b72f4] process > sv:runReport:getParams         [100%] 1 of 1 ✔
[-        ] process > sv:runReport:report            -
[-        ] process > output_sv                      -
[-        ] process > refine_with_sv                 -
[-        ] process > concat_refined_snp             -
[-        ] process > annotate_snp_vcf               -
[-        ] process > concat_snp_vcfs                -
[-        ] process > sift_clinvar_snp_vcf           -
[-        ] process > vcfStats                       -
[f1/0dc753] process > report_snp:getVersions         [100%] 1 of 1 ✔
[16/6362ce] process > report_snp:getParams           [100%] 1 of 1 ✔
[-        ] process > report_snp:makeReport          -
[-        ] process > output_snp                     -
[17/f6cb0c] process > validate_modbam (1)            [100%] 1 of 1 ✔
[b8/5d85ab] process > sample_probs (1)               [100%] 1 of 1 ✔
[5f/71013a] process > mod:modkit_phase (1)           [  0%] 0 of 13
[-        ] process > mod:concat_bedmethyl           -
[be/456d47] process > cnv_spectre:mosdepth (1)       [100%] 1 of 1 ✔
[-        ] process > cnv_spectre:callCNV            -
[-        ] process > cnv_spectre:bgzip_and_index... -
[-        ] process > cnv_spectre:annotate_vcf       -
[63/c5cb97] process > cnv_spectre:getVersions        [100%] 1 of 1 ✔
[3e/c52b63] process > cnv_spectre:add_snp_tools_t... [100%] 1 of 1 ✔
[3f/7686fd] process > cnv_spectre:getParams          [100%] 1 of 1 ✔
[-        ] process > cnv_spectre:makeReport         -
[-        ] process > output_cnv                     -
[8a/a3236e] process > str:call_str (2)               [ 15%] 2 of 13
[-        ] process > str:annotate_repeat_expansions -
[d2/eb19fc] process > str:getVersions                [100%] 1 of 1 ✔
[ff/3a927d] process > str:getParams                  [100%] 1 of 1 ✔
[90/73337b] process > str:bam_region_filter (5)      [ 23%] 3 of 13
[-        ] process > str:bam_read_filter            -
[-        ] process > str:generate_str_content       -
[-        ] process > str:concat_str_vcfs            -
[-        ] process > str:merge_tsv                  -
[-        ] process > str:make_report                -
[-        ] process > output_str                     -
[72/15cdff] process > configure_jbrowse (1)          [100%] 1 of 1 ✔
[-        ] process > combine_metrics_json           -
[c9/18f5d8] process > publish_artifact (8)           [100%] 8 of 8
ERROR ~ Error executing process > 'mod:modkit_phase (1)'

Caused by:
  Process `mod:modkit_phase (1)` terminated with an error exit status (2)

Command executed:

  modkit pileup \
      chrY_hp.bam \
      s4059249500 \
      --ref GCA_000001405.15_GRCh38_no_alt_analysis_set.fna \
      --partition-tag HP \
      --interval-size 1000000 \
      --prefix s4059249500.wf_mods.chrY \
      --log-filepath modkit.log \
      --region chrY \
      --filter-threshold 0.78125 0.8613281 \
      --threads 4 --combine-strands --cpg

  # Compress all
  for i in `ls s4059249500/`; do
      root_name=$( basename $i '.bed' )
      # modkit saves the file as params.sample_name.wf_mods_haplotype.bed
      # create a new name with the patter params.sample_name.wf_mods.haplotype.bedmethyl
      new_name=$( echo ${root_name} | sed 's/wf_mods_/wf_mods\./' )
      mv s4059249500/${root_name}.bed s4059249500/${new_name}.bedmethyl
      bgzip s4059249500/${new_name}.bedmethyl
  done

Command exit status:
  2

Command output:
  (empty)

Command error:
  error: unexpected argument '0.8613281' found

  Usage: modkit pileup [OPTIONS] <IN_BAM> <OUT_BED>

  For more information, try '--help'.

Application activity log entry

No response

Were you able to successfully run the latest version of the workflow with the demo data?

yes

Other demo data information

No response

cihaterdogan commented 6 months ago

Please also find the log file here log.txt

RenzoTale88 commented 6 months ago

@cihaterdogan could you please share the content of folder:

${OUTPUT}/workspace/b8/5d85ab*/

In particular, can you share the content of .command.env and .command.sh please?

cihaterdogan commented 6 months ago

Hi @RenzoTale88, Thank you for your reply. I was able to run the wf-human-variation pipeline with v2.0.0 and removed the old output due to space limitations. Unfortunately, I am unable to share the files you requested.

RenzoTale88 commented 6 months ago

Ok, no worries, I'll close this ticket then. We just released v2.2.2, so if you come across the issue again, please re-open this issue with a new comment. Thanks again, Andrea

agatafant commented 6 months ago

Hi @RenzoTale88 I am using v2.2.2 but I have exactly the same error. Thanks

RenzoTale88 commented 6 months ago

@agatafant we will need more investigation into this then. Can you please share the nextflow log?

agatafant commented 6 months ago

nextflow.log

RenzoTale88 commented 6 months ago

Can you please check the content of the following files:

/archive/s2/genomics/afant/NANOPORE_analysis/PDM0001NBLDN1R/pipe_wf_human_variation_threshold_latest/work/97/8923b6fe6fde356317df755ccedf3f/.command.env
/archive/s2/genomics/afant/NANOPORE_analysis/PDM0001NBLDN1R/pipe_wf_human_variation_threshold_latest/work/97/8923b6fe6fde356317df755ccedf3f/.command.err

agatafant commented 6 months ago

.command.env contains:

probs=0.8183594 0.7421875

.command.err is empty.

thanks
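
The `probs` line above holds two values, the same pattern as the two values that follow `--filter-threshold` in the failing command. A minimal sketch for checking this in a task directory, assuming the file has the `probs=...` form quoted above; the function name `count_thresholds` and the temporary demo file are illustrative and not part of the workflow:

```shell
#!/bin/sh
# Count the whitespace-separated values on the "probs=" line of a
# Nextflow .command.env file. More than one value means more than one
# threshold would be expanded after --filter-threshold.
count_thresholds() {
    env_file=$1
    line=$(grep '^probs=' "$env_file" | head -n 1)
    values=${line#probs=}
    # Intentional word splitting: one positional parameter per value.
    set -- $values
    echo "$#"
}

# Demo with the content reported in this thread (hypothetical temp file).
demo_env=$(mktemp)
printf 'probs=0.8183594 0.7421875\n' > "$demo_env"
echo "thresholds found: $(count_thresholds "$demo_env")"
rm -f "$demo_env"
```

A count above 1 suggests the BAM carries calls for more than one modification type.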

RenzoTale88 commented 6 months ago

Thanks for confirming this. Before progressing with some more tests, did you -resume the run after updating the workflow, or did you start from the beginning?

agatafant commented 6 months ago

I had to run -resume because:

  1. I had to add sniffles options in nextflow.config.
  2. The first run gave an error when pulling from the repository, so I manually ran: singularity pull --name ontresearch-wf-human-variation-sv-shac591518dd32ecc3936666c95ff08f6d7474e9728.img.pulling.1717507122239 docker://ontresearch/wf-human-variation-sv:shac591518dd32ecc3936666c95ff08f6d7474e9728

RenzoTale88 commented 6 months ago

The issue with this is that it will not retry the sample_probs process, and will therefore fail again with the same error. Can you try removing the following file and running again?

/archive/s2/genomics/afant/NANOPORE_analysis/PDM0001NBLDN1R/pipe_wf_human_variation_threshold_latest/work/97/8923b6fe6fde356317df755ccedf3f/.command.env

RenzoTale88 commented 6 months ago

@agatafant can you provide more details on how you generated the input BAM?

agatafant commented 6 months ago

As you suggested, I removed /archive/s2/genomics/afant/NANOPORE_analysis/PDM0001NBLDN1R/pipe_wf_human_variation_threshold_latest/work/97/8923b6fe6fde356317df755ccedf3f/.command.env and launched the pipeline again with -resume, but I got the same error. The BAM was generated with dorado basecaller v0.6.0.

RenzoTale88 commented 6 months ago

@agatafant I was mostly interested in knowing which remora model(s) you used. Did you use a single model, multiple models, etc.?

agatafant commented 6 months ago

dorado basecaller -x cuda:all \
    --reference /archive/s1/sconsRequirements/databases/reference/resources_broad_hg38_v0_Homo_sapiens_assembly38.fasta \
    /archive/s2/genomics/afant/PCG_Nanopore/out_PCG_PDM_PDM0001NBLDN1R/dorado0.6.0/dna_r10.4.1_e8.2_400bps_sup@v4.3.0 \
    /archive/genomic_data/Collaboration/PCG_ParkinsonDiseaseMilano/FAST5/PCG_PDM_PDM0001NBLDN1R/PDM0001NBLDN1R/20240409_1716_1G_PAS38058_469a1fd2/pod5/ \
    --modified-bases-models /archive/s2/genomics/afant/PCG_Nanopore/out_PCG_PDM_PDM0001NBLDN1R/dorado0.6.0/dna_r10.4.1_e8.2_400bps_sup@v4.3.0_6mA@v2,/archive/s2/genomics/afant/PCG_Nanopore/out_PCG_PDM_PDM0001NBLDN1R/dorado0.6.0/dna_r10.4.1_e8.2_400bps_sup@v4.3.0_5mC_5hmC@v1

RenzoTale88 commented 6 months ago

Ok, this confirms the issue: the workflow currently can't handle multiple modified bases. With more than one modification model, it computes multiple filtering thresholds, which modkit then rejects as an unexpected argument. We will work on a patch and notify you when it is ready.
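
One way to see whether a BAM carries more than one modification type is to look at the MM tag of a read. A minimal sketch, assuming the tag value follows the layout described in the SAM optional-tags specification (for example `C+h?,...;C+m?,...;A+a?,...;`); the function name `list_mod_types` and the example tag string are made up for illustration and are not taken from the BAM in this thread:

```shell
#!/bin/sh
# List the distinct "<base><strand><code>" entries found in an MM tag
# value (passed without the "MM:Z:" prefix), one per line, sorted.
list_mod_types() {
    printf '%s\n' "$1" \
        | tr ';' '\n' \
        | sed -n 's/^\([ACGTUN][+-][A-Za-z0-9]*\).*/\1/p' \
        | LC_ALL=C sort -u
}

# Illustrative tag value with 5hmC, 5mC and 6mA calls.
example_mm='C+h?,1,2;C+m?,3;A+a?,0;'
list_mod_types "$example_mm"
# prints:
# A+a
# C+h
# C+m
```

The tag itself could be pulled from the first read with something like `samtools view input.bam | head -n 1 | tr '\t' '\n' | grep '^MM:Z:'`, where `input.bam` is a placeholder. Entries for more than one canonical base (here A and C) would match the situation described above.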

agatafant commented 6 months ago

Ok great, thank you. In the meantime I will regenerate the BAM with only one modified base model.

RenzoTale88 commented 5 months ago

@agatafant we just released wf-human-variation v2.2.4, that should support multiple modification types (e.g. modA and modC). Can you try with this release and let us know if it works for you?

agatafant commented 5 months ago

Thank you @RenzoTale88. I still have this problem when launching the pipeline: I have to remove the folder ~/.nextflow/assets/epi2me-labs, otherwise it fails with:

Pulling epi2me-labs/wf-human-variation...
epi2me-labs/wf-human-variation contains uncommitted changes -- cannot pull from repository

But since I have to modify some options in nextflow.config, I have to stop the pipeline (is there a better way to do this?), modify the config and then resume. When resuming, it fails again with the same message.

RenzoTale88 commented 5 months ago

Hi @agatafant, try to first drop the workflow:

nextflow drop epi2me-labs/wf-human-variation

And then try the analysis again, specifying version 2.2.4:

nextflow run epi2me-labs/wf-human-variation -r v2.2.4 [OPTIONS HERE]
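
On the earlier question about a better way to change options: editing the copy of nextflow.config under ~/.nextflow/assets is a plausible cause of the "uncommitted changes" message, although the thread does not confirm which file was edited. A sketch of one alternative, keeping overrides in a separate file passed with Nextflow's `-c` option. The file name `custom.config` is hypothetical, and `threads = 16` only mirrors the `--threads 16` used earlier in this thread:

```shell
#!/bin/sh
# Write overrides to a separate config file instead of editing the
# workflow's own nextflow.config (file name is illustrative).
cat > custom.config <<'EOF'
params {
    threads = 16
}
EOF

# Print the command rather than running it, so the sketch has no side effects.
echo "nextflow run epi2me-labs/wf-human-variation -r v2.2.4 -c custom.config -resume [OPTIONS HERE]"
```

Because the downloaded workflow stays unmodified, a later pull should not trip over local changes.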

agatafant commented 5 months ago

Hi, I have launched the pipeline, but it keeps failing because of:

sort: write failed: /var/tmp/pbs.1385700.5kgpsmhpcfe/sortCqvMDv: No space left on device

Why is it writing to /var/tmp?

I have attached the log: nextflow.log

RenzoTale88 commented 5 months ago

From what I can see, you are using a PBS distributed system right? If so, you need to discuss with your IT support to define an appropriate configuration for your system to avoid this issue.
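
The path in the error above looks like a per-job temporary directory. GNU sort writes its temporary files to the directory named by TMPDIR, falling back to /tmp, so one way to investigate is to check how much space that location has from inside a job. A minimal sketch; the function name `sort_tmpdir` is illustrative:

```shell
#!/bin/sh
# Print the directory sort would use for temporary files when no -T
# option is given: $TMPDIR if set and non-empty, otherwise /tmp.
sort_tmpdir() {
    printf '%s\n' "${TMPDIR:-/tmp}"
}

dir=$(sort_tmpdir)
echo "sort temporary directory: $dir"
# Show free space there in 1 KiB blocks (POSIX output format).
df -Pk "$dir" 2>/dev/null || echo "cannot query free space for $dir"
```

If that location is too small, pointing TMPDIR at a larger scratch directory in the job script is one option to raise with the cluster administrators. Whether the setting reaches the containers depends on the site configuration.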

RenzoTale88 commented 4 months ago

@agatafant did you manage to run the workflow successfully?

agatafant commented 4 months ago

No. Sadly, I had to remove the --mod option in order to run the pipeline successfully.

SamStudio8 commented 3 days ago

@agatafant Sorry you couldn't get --mod working. As @RenzoTale88 mentioned, the workflow does not choose the TMPDIR; this will be down to your cluster configuration. I'm closing this issue now, but please open a new issue if you encounter any further trouble!