epi2me-labs / wf-human-variation

cnv:makeReport ValueError: The truth value of an array with more than one element is ambiguous. Use a.any() or a.all() #114

Closed ffavero closed 8 months ago

ffavero commented 10 months ago

Operating System

CentOS 7

Other Linux

No response

Workflow Version

v1.8.3

Workflow Execution

Command line

EPI2ME Version

No response

CLI command run

NXF_TEMP=$TMPDIR \
NXF_SINGULARITY_CACHEDIR=$cachedir \
NXF_HOME=$nfhome \
NXF_ASSETS=$nfhome/assets \
nextflow run /home/projects/cu_10027/data/pipelines/nextflow/epi2me-labs/wf-human-variation/main.nf \
  -c $config \
  -w ${OUTPUT}/workspace \
  -profile singularitypbs \
  --snp --sv --mod --str --cnv \
  --phase_mod --phase_vcf \
  --sex male \
  --bed $target_bed \
  --ref $ref_fa \
  --sample_name $samplename \
  --out_dir ${OUTPUT} \
  --basecaller_cfg 'dna_r9.4.1_e8_hac@v3.3' \
  --bam $cramfile

Workflow Execution - CLI Execution Profile

None

What happened?

I had to increase the memory for this step in this sample: I had set a default of 12.GB for each job, and it wasn't enough. Other samples run this step correctly with 12.GB of RAM.

But instead of being killed (which happens when memory usage exceeds the requested amount), the job fails with a Python exception. It appears that, for some reason, this specific sample produces more calls than the single one expected in the make_report code.

I am running the workflow with Singularity and a TORQUE/PBS executor. After several tweaks to the config (maybe I can share it by opening another issue?), I managed to run the workflow end-to-end with almost all of the analyses enabled for some samples. So I am inclined to think this error is caused by a rare, unexpected edge case, and that it is probably enough to change the code as suggested by the NumPy/Pandas error message.
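For reference, the exception can be reproduced outside the workflow. The snippet below is only a minimal sketch: the values are made up, and it assumes that `call` ends up as a NumPy array holding more than one element when it is compared with `cnv_call`, which is what the wording of the error message suggests. It is not taken from cnv_plot.py.

```python
import numpy as np

# Hypothetical values: `call` holds two copy-number calls instead of one scalar.
call = np.array([2, 3])
cnv_call = 2

raised = False
message = ""
try:
    # The comparison is elementwise and yields array([True, False]);
    # `if` then has to turn that array into a single bool, which NumPy refuses.
    if call == cnv_call:
        pass
except ValueError as exc:
    raised = True
    message = str(exc)

# Explicit reductions, as suggested by the error message.
any_match = bool((call == cnv_call).any())  # at least one element matches
all_match = bool((call == cnv_call).all())  # every element matches
```

Note that `.any()` or `.all()` only silences the exception. Whether either reduction gives the right answer depends on what multiple calls for one chromosome are supposed to mean in the report.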

Relevant log output

ERROR ~ Error executing process > 'cnv:makeReport (1)'

Caused by:
  Process `cnv:makeReport (1)` terminated with an error exit status (1)

Command executed:

  workflow-glue cnv_plot             -q ICGC_PCAL45_T01_combined.bed             -o ICGC_PCAL45_T01.wf-human-cnv-report.html             --read_stats ICGC_PCAL45_T01.readstats.tsv.gz            --params params.json             --versions versions             --bin_size 500             --genome hg38             --sample_id ICGC_PCAL45_T01             --noise_plot ICGC_PCAL45_T01_noise_plot.png             --isobar_plot ICGC_PCAL45_T01_isobar_plot.png

Command exit status:
  1

Command output:
  (empty)

Command error:
  INFO:    Converting SIF file to temporary sandbox...
  WARNING: underlay of /etc/localtime required more than 50 (78) bind mounts
  [13:43:10 - workflow_glue] Starting entrypoint.
  Traceback (most recent call last):
    File "/home/projects/cu_10027/data/pipelines/nextflow/epi2me-labs/wf-human-variation/bin/workflow-glue", line 7, in <module>
      cli()
    File "/home/projects/cu_10027/data/pipelines/nextflow/epi2me-labs/wf-human-variation/bin/workflow_glue/__init__.py", line 72, in cli
      args.func(args)
    File "/home/projects/cu_10027/data/pipelines/nextflow/epi2me-labs/wf-human-variation/bin/workflow_glue/cnv_plot.py", line 540, in main
      report = make_report(
    File "/home/projects/cu_10027/data/pipelines/nextflow/epi2me-labs/wf-human-variation/bin/workflow_glue/cnv_plot.py", line 386, in make_report
      if call == cnv_call:
  ValueError: The truth value of an array with more than one element is ambiguous. Use a.any() or a.all()
  INFO:    Cleaning up image...

Work dir:
  /home/projects/cu_10027/projects/prostate/data/data_processed/nanopore/analysis/ICGC_PCAL45_T01/workspace/ac/9adeff906f1b9a56649f18e232c048

Tip: view the complete command output by changing to the process work dir and entering the command `cat .command.out`

 -- Check '.nextflow.log' file for details

Application activity log entry

No response

vlshesketh commented 10 months ago

Hi @ffavero, thank you for reporting this - this could be an issue with the way we collate segment data to identify the number of copies of each chromosome.

If it is safe to do so (i.e. the data is not clinically sensitive), are you able to attach ICGC_PCAL45_T01_combined.bed to this ticket, as this will help with troubleshooting?

jessicadlang commented 9 months ago

I am having the same problem. Is there any resolution to this? I'm happy to provide my BED file if helpful, but it appears I can't upload that file type.

vlshesketh commented 9 months ago

Hi @jessicadlang apologies for the late reply - yes please, if you are able to share the BED file that would greatly help with troubleshooting this issue. Can you try renaming the file extension to .txt and see if it will upload that way?

jessicadlang commented 9 months ago

Here is our .bed file SAMPLE_combined.bed.txt

vlshesketh commented 9 months ago

Thank you! We will investigate and update here as soon as possible.

jessicadlang commented 9 months ago

@vlshesketh, any chance you have some updates on this?

vlshesketh commented 8 months ago

Hi @jessicadlang, apologies for the delay with this. I am currently looking into it and should have something to address it shortly.

vlshesketh commented 8 months ago

Hi @jessicadlang and @ffavero - thanks for your patience while I looked into this. There is now a fix for the error you have described, which will be available in the next release. You can test it in the meantime by running the prerelease branch:

nextflow run epi2me-labs/wf-human-variation -r prerelease

The CNV report generation script now includes an additional category for chromosomes with undetermined copy number.
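As a rough illustration of the idea (not the workflow's actual implementation), an "undetermined" category could be assigned whenever a chromosome has no single most common copy number. The function name, the label, and the `(chromosome, copy_number)` input layout below are all hypothetical.

```python
from collections import defaultdict
from statistics import multimode

UNDETERMINED = "undetermined"  # hypothetical label for ambiguous chromosomes


def chromosome_calls(segments):
    """Return one call per chromosome from (chromosome, copy_number) pairs.

    A chromosome whose segments have a tie for the most common copy number
    gets the UNDETERMINED label instead of several values.
    """
    by_chrom = defaultdict(list)
    for chrom, copy_number in segments:
        by_chrom[chrom].append(copy_number)

    calls = {}
    for chrom, values in by_chrom.items():
        modes = multimode(values)  # all values tied for most common
        calls[chrom] = modes[0] if len(modes) == 1 else UNDETERMINED
    return calls


# Made-up segments: chr1 has a clear majority, chr2 has a tie.
example = [("chr1", 2), ("chr1", 2), ("chr1", 3), ("chr2", 1), ("chr2", 3)]
calls = chromosome_calls(example)
```

With this layout every chromosome maps to exactly one value, so a later comparison against an expected call is always a scalar comparison.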

SamStudio8 commented 8 months ago

We believe that this is now fixed in 1.10.1, thanks for your report! Please open a new issue if you run into any more trouble.