nf-core / sarek

Analysis pipeline to detect germline or somatic variants (pre-processing, variant calling and annotation) from WGS / targeted sequencing
https://nf-co.re/sarek
MIT License

`cnvkit` output files are missing #1625

Open bounlu opened 3 weeks ago

bounlu commented 3 weeks ago

The cnvkit output files are missing: the output folder contains no segmentation file (.cns) and no scatter or diagram plot files (.pdf). The cnvkit batch module does not generate them.

FriederikeHanssen commented 3 weeks ago

Please provide your command and the respective log files (in this case the .command.sh of a cnvkit process would also be useful) so we can investigate. This is not a general issue, since these files are clearly generated in the full-size tests: https://nf-co.re/sarek/3.4.3/results/sarek/results-e92242ead3dff8e24e13adbbd81bfbc0b6862e4c/test_full_aws/variant_calling/cnvkit/HCC1395T_vs_HCC1395N/
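For anyone unsure where to find those scripts: Nextflow writes a .command.sh into each task's work directory. The following is only a rough sketch for locating the ones that call cnvkit; the helper name `find_cnvkit_scripts` is made up for illustration, and the location of the work directory (here passed as an argument) depends on how the pipeline was launched.

```python
from pathlib import Path


def find_cnvkit_scripts(work_dir):
    """Return sorted paths of .command.sh files under work_dir that mention cnvkit.py.

    Hypothetical helper: it only does a plain substring search and does not
    parse the scripts in any way.
    """
    hits = []
    for script in Path(work_dir).rglob(".command.sh"):
        try:
            text = script.read_text(errors="replace")
        except OSError:
            # Skip scripts that cannot be read (permissions, broken links).
            continue
        if "cnvkit.py" in text:
            hits.append(script)
    return sorted(hits)
```

Calling `find_cnvkit_scripts("work")` from the launch directory would list candidate scripts; the matching .command.log and .command.err files sit next to each one.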

bounlu commented 3 weeks ago

This might be the issue.

bounlu commented 2 weeks ago

When I cd to the work directory and run the segment command, it fails as below:

$ docker run --rm -it -v /data:/data quay.io/biocontainers/mulled-v2-780d630a9bb6a0ff2e7b6f730906fd703e40e98f:c94363856059151a2974dc501fb07a0360cc60a3-0
$ cnvkit.py segment my_sample.cnr -o my_sample.cns
Segmenting with method 'cbs', significance threshold 0.0001, in 1 processes
Traceback (most recent call last):
  File "/usr/local/bin/cnvkit.py", line 10, in <module>
    sys.exit(main())
  File "/usr/local/lib/python3.10/site-packages/cnvlib/cnvkit.py", line 10, in main
    args.func(args)
  File "/usr/local/lib/python3.10/site-packages/cnvlib/commands.py", line 994, in _cmd_segment
    results = segmentation.do_segmentation(
  File "/usr/local/lib/python3.10/site-packages/cnvlib/segmentation/__init__.py", line 79, in do_segmentation
    rets = list(
  File "/usr/local/lib/python3.10/site-packages/cnvlib/segmentation/__init__.py", line 123, in _ds
    return _do_segmentation(*args)
  File "/usr/local/lib/python3.10/site-packages/cnvlib/segmentation/__init__.py", line 205, in _do_segmentation
    seg_out = core.call_quiet(
  File "/usr/local/lib/python3.10/site-packages/cnvlib/core.py", line 32, in call_quiet
    raise RuntimeError(
RuntimeError: Subprocess command failed:
$ Rscript --no-restore --no-environ /tmp/tmpula3kw8u

b'Loading probe coverages into a data frame\nWarning message:\nIn CNA(cbind(tbl$log2), tbl$chromosome, tbl$start, data.type = "logratio",  :\n  markers with missing chrom and/or maploc removed\n\nSegmenting the probe data\nError in segment(cna, weights = tbl$weight, alpha = 1e-04) : \n  length of weights should be the same as the number of probes\nExecution halted\n'

This is why no .cns file is generated: the batch command terminates at this step.

I notice the weight column is empty in the .cnr file:

$ head my_sample.cnr | column -t
chromosome  start    end      gene        depth  log2         weight
chr1        150500   300849   Antitarget  0      -0.00216322  
chr1        300849   451198   Antitarget  0      -0.00216322  
chr1        451198   601547   Antitarget  0      -0.00216322  
chr1        601547   751897   Antitarget  0      -0.00216322  
chr1        751897   902246   Antitarget  0      -0.00216322  
chr1        902246   1052595  Antitarget  0      -0.00216322  
chr1        1052595  1202944  Antitarget  0      -0.00216322  
chr1        1202944  1353294  Antitarget  0      -0.00216322  
chr1        1353294  1503643  Antitarget  0      -0.00216322 
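To check whether this affects the whole file rather than just the first rows, something like the following could be used. It is a minimal sketch that assumes the .cnr is tab-separated with the header shown above; `count_missing_weights` is a hypothetical helper, not part of cnvkit.

```python
import csv


def count_missing_weights(cnr_path):
    """Return (total_rows, rows_with_empty_weight) for a tab-separated .cnr file.

    Assumption: the file has a header line containing a 'weight' column.
    A row counts as missing when the field is absent, empty or whitespace.
    """
    total = 0
    missing = 0
    with open(cnr_path, newline="") as handle:
        for row in csv.DictReader(handle, delimiter="\t"):
            total += 1
            if not (row.get("weight") or "").strip():
                missing += 1
    return total, missing
```

For example, `count_missing_weights("my_sample.cnr")` returns the two counts as a tuple; if both numbers are equal, every bin lacks a weight.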

Seems to be related to this issue.

Here is the quick fix.