etal / cnvkit

Copy number variant detection from targeted DNA sequencing
http://cnvkit.readthedocs.org

Execution halted during segmentation #380

Closed by ahwanpandey 6 years ago

ahwanpandey commented 6 years ago

Hello,

I am getting the following error when running the segment command:

(virtualenv_python_2.7.5) [apandey@papr-expanded01 ]$ cnvkit.py segment MAOC00944-3-6.cnr -o MAOC00944-3-6.cns -t 1e5 -p 1
Segmenting with method 'cbs', significance threshold 100000, in 1 processes
Dropped 5 outlier bins:
  chromosome     start       end           gene      log2  depth    weight
0          1  13124981  13125981  RP13-221M14.6 -24.96880  0.000  0.261873
1          1  13686173  13687173              . -23.55250  0.000  0.335485
2          1  17676674  17677674          PADI4  -5.75036  0.927  0.568496
3          1  25692942  25693942           RHCE  -2.42823  1.680  0.176658
4          1  89476940  89477940           GBP3 -25.75090  0.000  0.610914
Dropped 5 / 119174 bins on chromosome 1
Traceback (most recent call last):
  File "/researchers/ahwan.pandey/Tools/virtualenv_python_2.7.5/bin/cnvkit.py", line 13, in <module>
    args.func(args)
  File "/researchers/ahwan.pandey/Tools/virtualenv_python_2.7.5/lib/python2.7/site-packages/cnvlib/commands.py", line 608, in _cmd_segment
    processes=args.processes)
  File "/researchers/ahwan.pandey/Tools/virtualenv_python_2.7.5/lib/python2.7/site-packages/cnvlib/segmentation/__init__.py", line 53, in do_segmentation
    for _, ca in cnarr.by_arm())))
  File "/researchers/ahwan.pandey/Tools/virtualenv_python_2.7.5/lib/python2.7/site-packages/cnvlib/parallel.py", line 26, in map
    return map(func, iterable)
  File "/researchers/ahwan.pandey/Tools/virtualenv_python_2.7.5/lib/python2.7/site-packages/cnvlib/segmentation/__init__.py", line 77, in _ds
    return _do_segmentation(*args)
  File "/researchers/ahwan.pandey/Tools/virtualenv_python_2.7.5/lib/python2.7/site-packages/cnvlib/segmentation/__init__.py", line 142, in _do_segmentation
    seg_out = core.call_quiet('Rscript', '--vanilla', script_fname)
  File "/researchers/ahwan.pandey/Tools/virtualenv_python_2.7.5/lib/python2.7/site-packages/cnvlib/core.py", line 36, in call_quiet
    % (' '.join(args), err))
RuntimeError: Subprocess command failed:
$ Rscript --vanilla /tmp/tmp1kAVTC

Loading probe coverages into a data frame
Segmenting the probe data
Error in rep(0, max.ones * (max.ones + 1)/2) : invalid 'times' argument
Calls: segment -> getbdry
Execution halted

Any ideas what is going on?

Thanks!

ahwanpandey commented 6 years ago

Hi again,

The above was run using CNVkit 0.9.1

I just updated to 0.9.4 and got the same error.

(virtualenv_python_2.7.5) [apandey@papr-expanded01 ]$ cnvkit.py segment MAOC00944-3-6.cnr -o MAOC00944-3-6.cns -t 1e5 -p 1
Segmenting with method 'cbs', significance threshold 100000.0, in 1 processes
/researchers/ahwan.pandey/Tools/virtualenv_python_2.7.5/lib/python2.7/site-packages/scipy/signal/_arraytools.py:45: FutureWarning: Using a non-tuple sequence for multidimensional indexing is deprecated; use `arr[tuple(seq)]` instead of `arr[seq]`. In the future this will be interpreted as an array index, `arr[np.array(seq)]`, which will result either in an error or a different result.
  b = a[a_slice]
Traceback (most recent call last):
  File "/researchers/ahwan.pandey/Tools/virtualenv_python_2.7.5/bin/cnvkit.py", line 13, in <module>
    args.func(args)
  File "/researchers/ahwan.pandey/Tools/virtualenv_python_2.7.5/lib/python2.7/site-packages/cnvlib/commands.py", line 630, in _cmd_segment
    processes=args.processes)
  File "/researchers/ahwan.pandey/Tools/virtualenv_python_2.7.5/lib/python2.7/site-packages/cnvlib/segmentation/__init__.py", line 62, in do_segmentation
    for _, ca in cnarr.by_arm())))
  File "/researchers/ahwan.pandey/Tools/virtualenv_python_2.7.5/lib/python2.7/site-packages/cnvlib/parallel.py", line 26, in map
    return map(func, iterable)
  File "/researchers/ahwan.pandey/Tools/virtualenv_python_2.7.5/lib/python2.7/site-packages/cnvlib/segmentation/__init__.py", line 87, in _ds
    return _do_segmentation(*args)
  File "/researchers/ahwan.pandey/Tools/virtualenv_python_2.7.5/lib/python2.7/site-packages/cnvlib/segmentation/__init__.py", line 159, in _do_segmentation
    seg_out = core.call_quiet(rscript_path, '--vanilla', script_fname)
  File "/researchers/ahwan.pandey/Tools/virtualenv_python_2.7.5/lib/python2.7/site-packages/cnvlib/core.py", line 36, in call_quiet
    % (' '.join(args), err))
RuntimeError: Subprocess command failed:
$ Rscript --vanilla /tmp/tmpMYm6cD

Loading probe coverages into a data frame
Segmenting the probe data
Error in rep(0, max.ones * (max.ones + 1)/2) : invalid 'times' argument
Calls: segment -> getbdry
Execution halted

ahwanpandey commented 6 years ago

OK, I think I figured it out. The culprit was the value I was using for "-t". I removed it altogether and it works.

Doesn't work

cnvkit.py segment MAOC00944-3-6.cnr -o MAOC00944-3-6.cns -t 1e5 -p 1

Works

cnvkit.py segment MAOC00944-3-6.cnr -o MAOC00944-3-6.cns -p 1

I got the "-t 1e5" from:

https://github.com/etal/cnvkit-examples/issues/1

Another issue: it looks like I can't access the help text. I wanted to check which values are acceptable for "-t" and what the default is.

(virtualenv_python_2.7.5) [apandey@papr-expanded01 ]$ cnvkit.py segment -h
Traceback (most recent call last):
  File "/researchers/ahwan.pandey/Tools/virtualenv_python_2.7.5/bin/cnvkit.py", line 12, in <module>
    args = commands.parse_args()
  File "/researchers/ahwan.pandey/Tools/virtualenv_python_2.7.5/lib/python2.7/site-packages/cnvlib/commands.py", line 1823, in parse_args
    return AP.parse_args(args=args)
  File "/usr/lib64/python2.7/argparse.py", line 1688, in parse_args
    args, argv = self.parse_known_args(args, namespace)
  File "/usr/lib64/python2.7/argparse.py", line 1720, in parse_known_args
    namespace, args = self._parse_known_args(args, namespace)
  File "/usr/lib64/python2.7/argparse.py", line 1908, in _parse_known_args
    positionals_end_index = consume_positionals(start_index)
  File "/usr/lib64/python2.7/argparse.py", line 1885, in consume_positionals
    take_action(action, args)
  File "/usr/lib64/python2.7/argparse.py", line 1794, in take_action
    action(self, namespace, argument_values, option_string)
  File "/usr/lib64/python2.7/argparse.py", line 1090, in __call__
    namespace, arg_strings = parser.parse_known_args(arg_strings, namespace)
  File "/usr/lib64/python2.7/argparse.py", line 1720, in parse_known_args
    namespace, args = self._parse_known_args(args, namespace)
  File "/usr/lib64/python2.7/argparse.py", line 1926, in _parse_known_args
    start_index = consume_optional(start_index)
  File "/usr/lib64/python2.7/argparse.py", line 1866, in consume_optional
    take_action(action, args, option_string)
  File "/usr/lib64/python2.7/argparse.py", line 1794, in take_action
    action(self, namespace, argument_values, option_string)
  File "/usr/lib64/python2.7/argparse.py", line 994, in __call__
    parser.print_help()
  File "/usr/lib64/python2.7/argparse.py", line 2327, in print_help
    self._print_message(self.format_help(), file)
  File "/usr/lib64/python2.7/argparse.py", line 2301, in format_help
    return formatter.format_help()
  File "/usr/lib64/python2.7/argparse.py", line 279, in format_help
    help = self._root_section.format_help()
  File "/usr/lib64/python2.7/argparse.py", line 209, in format_help
    func(*args)
  File "/usr/lib64/python2.7/argparse.py", line 209, in format_help
    func(*args)
  File "/usr/lib64/python2.7/argparse.py", line 515, in _format_action
    help_text = self._expand_help(action)
  File "/usr/lib64/python2.7/argparse.py", line 601, in _expand_help
    return self._get_help_string(action) % params
TypeError: float argument required, not str
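
The last frame of this traceback shows argparse expanding a help string with %-formatting. One way this kind of TypeError can arise is a numeric format code such as %(default)g paired with a default that is stored as a string. This is an assumption for illustration only; the thread does not identify which CNVkit help string was responsible. A minimal sketch, with a hypothetical help template and hypothetical helper names:

```python
# Hypothetical help template; not taken from CNVkit's source.
HELP_TEMPLATE = "Significance threshold. [Default: %(default)g]"


def expand_help(template, default):
    """Mimic the '%'-expansion that argparse applies to help strings."""
    return template % {"default": default}


def try_expand(template, default):
    """Return the expanded help text, or the error message if expansion fails."""
    try:
        return expand_help(template, default)
    except TypeError as exc:
        return "TypeError: %s" % exc


# A string default cannot be formatted with %g, so expansion fails.
print(try_expand(HELP_TEMPLATE, "1e-4"))
# A float default formats cleanly.
print(try_expand(HELP_TEMPLATE, 1e-4))
```

If this is the cause, storing the default as a float (or using %(default)s in the help string) would let the help text render.
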

etal commented 6 years ago

Sorry for the trouble. Here's the segment help text:

usage: cnvkit.py segment [-h] [-o FILENAME] [-d DATAFRAME]
                         [-m {cbs,flasso,haar,none,hmm,hmm-tumor,hmm-germline}]
                         [-t THRESHOLD] [--drop-low-coverage]
                         [--drop-outliers FACTOR] [--rscript-path PATH]
                         [-p [PROCESSES]] [-v FILENAME] [-i SAMPLE_ID]
                         [-n NORMAL_ID]
                         [--min-variant-depth MIN_VARIANT_DEPTH]
                         [-z [ALT_FREQ]]
                         filename

positional arguments:
  filename              Bin-level log2 ratios (.cnr file), as produced by
                        'fix'.

optional arguments:
  -h, --help            show this help message and exit
  -o FILENAME, --output FILENAME
                        Output table file name (CNR-like table of segments,
                        .cns).
  -d DATAFRAME, --dataframe DATAFRAME
                        File name to save the raw R dataframe emitted by CBS
                        or Fused Lasso. (Useful for debugging.)
  -m {cbs,flasso,haar,none,hmm,hmm-tumor,hmm-germline}, --method {cbs,flasso,haar,none,hmm,hmm-tumor,hmm-germline}
                        Segmentation method (CBS, fused lasso, haar wavelet,
                        HMM), or 'none' for chromosome arm-level averages as
                        segments. [Default: cbs]
  -t THRESHOLD, --threshold THRESHOLD
                        Significance threshold (p-value or FDR, depending on
                        method) to accept breakpoints during segmentation. For
                        HMM methods, this is the smoothing window size.
  --drop-low-coverage   Drop very-low-coverage bins before segmentation to
                        avoid false-positive deletions in poor-quality tumor
                        samples.
  --drop-outliers FACTOR
                        Drop outlier bins more than this many multiples of the
                        95th quantile away from the average within a rolling
                        window. Set to 0 for no outlier filtering. [Default:
                        10]
  --rscript-path PATH   Path to the Rscript excecutable to use for running R
                        code. Use this option to specify a non-default R
                        installation. [Default: Rscript]
  -p [PROCESSES], --processes [PROCESSES]
                        Number of subprocesses to segment in parallel. Give 0
                        or a negative value to use the maximum number of
                        available CPUs. [Default: use 1 process]

To additionally segment SNP b-allele frequencies:
  -v FILENAME, --vcf FILENAME
                        VCF file name containing variants for segmentation by
                        allele frequencies.
  -i SAMPLE_ID, --sample-id SAMPLE_ID
                        Specify the name of the sample in the VCF (-v/--vcf)
                        to use for b-allele frequency extraction and as the
                        default plot title.
  -n NORMAL_ID, --normal-id NORMAL_ID
                        Corresponding normal sample ID in the input VCF
                        (-v/--vcf). This sample is used to select only
                        germline SNVs to plot b-allele frequencies.
  --min-variant-depth MIN_VARIANT_DEPTH
                        Minimum read depth for a SNV to be displayed in the
                        b-allele frequency plot. [Default: 20]
  -z [ALT_FREQ], --zygosity-freq [ALT_FREQ]
                        Ignore VCF's genotypes (GT field) and instead infer
                        zygosity from allele frequencies. [Default if used
                        without a number: 0.25]

I think you want -t 1e-5 instead of -t 1e5.
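
The help text describes -t as a p-value or FDR for methods such as CBS, so 1e5 (that is, 100000) falls outside the meaningful range, while 1e-5 is a valid cutoff. A minimal sketch of how a wrapper script could catch this before calling cnvkit.py follows. The names significance_threshold and build_parser are hypothetical and not part of CNVkit. For the HMM methods the same option is a smoothing window size, so this range check would apply only to the p-value-based methods.

```python
import argparse


def significance_threshold(text):
    """Parse a p-value-style threshold (hypothetical helper, not part of CNVkit)."""
    try:
        value = float(text)
    except ValueError:
        raise argparse.ArgumentTypeError("not a number: %r" % text)
    # A p-value or FDR cutoff only makes sense strictly between 0 and 1.
    if not 0.0 < value < 1.0:
        raise argparse.ArgumentTypeError(
            "threshold must be between 0 and 1, got %r" % text)
    return value


def build_parser():
    """Build an illustrative wrapper parser that validates -t up front."""
    parser = argparse.ArgumentParser(description="Illustrative wrapper only")
    parser.add_argument("-t", "--threshold", type=significance_threshold)
    return parser
```

With a check like this, -t 1e5 is rejected with a readable message before any segmentation starts, while -t 1e-5 is accepted.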