HenrikBengtsson / PSCBS

🔬 R package: Analysis of Parent-Specific DNA Copy Numbers
https://cran.r-project.org/package=PSCBS

segmentByCBS(..., w=w, knownSegments=segs) gives validation error #28

Closed HenrikBengtsson closed 9 years ago

HenrikBengtsson commented 9 years ago

Minimal "reproducible" example in https://github.com/etal/cnvkit/issues/47.

Example

> library(PSCBS)
> data <- readRDS("PSCBS_issue28.rds")
> str(data)
'data.frame':   1498 obs. of  4 variables:
 $ chromosome: int  1 1 1 1 1 1 1 1 1 1 ...
 $ x         : int  464423 676306 882963 1086530 1508981 1512177 1715329 1919111 2123226 2407978 ...
 $ y         : num  -0.631 -0.374 -0.298 -0.66 -0.542 ...
 $ w         : num  0.316 0.871 0.969 0.965 0.492 ...

> segs <- data.frame(chromosome=1L, start=c(-Inf, 121049953), end=c(121049952, 142517941))
> segs
  chromosome     start       end
1          1      -Inf 121049952
2          1 121049953 142517941

> fit <- segmentByCBS(data, knownSegments=segs)
Segmenting by CBS...
 Chromosome: 1
Error: !hasWeights || !is.null(fit$data$w) is not TRUE
Segmenting by CBS...done

Troubleshooting

> fit <- segmentByCBS(data, knownSegments=segs, verbose=-100)
[...]
   Segmenting by CBS...done
     sampleName chromosome     start       end nbrOfLoci    mean
   1       <NA>          1    464423   3004886        34 -0.5345
   2       <NA>          1   3004886 119434332       755 -0.2217
   3       <NA>          1 119434332 121049952        38  0.3596
  Segment #1 ('chr1:(-Inf,121049952)') of 2...done
  Segment #2 ('chr1:(121049953,142517941)') of 2...
   'data.frame':        0 obs. of  5 variables:
    $ chrom: int
    $ x    : num
    $ y    : num
    $ index: int
    $ w    : num
Error: !hasWeights || !is.null(fit$data$w) is not TRUE
  Segment #2 ('chr1:(121049953,142517941)') of 2...done
 Segmenting multiple segments on current chromosome...done
Segmenting by CBS...done

Conclusion

This is probably a special case where the data field w gets dropped because the second known segment contains exactly zero loci. I'll investigate and fix. The problem appears in PSCBS 0.45.0 only.
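
To check whether a given call might hit this special case, one can count the loci that fall inside each known segment before segmenting. The following is a minimal base-R sketch, not part of PSCBS: the helper countLoci() and the toy data values are hypothetical, and it assumes each segment is a closed interval [start, end].

```r
# Toy data (hypothetical values, not the data set from this issue)
data <- data.frame(
  chromosome = 1L,
  x = c(464423, 3004886, 119434332, 121049000),
  y = c(-0.63, -0.37, 0.30, 0.36),
  w = c(0.3, 0.9, 1.0, 0.5)
)

# Known segments, as in the example above
segs <- data.frame(
  chromosome = 1L,
  start = c(-Inf, 121049953),
  end = c(121049952, 142517941)
)

# Hypothetical helper: number of loci within each known segment,
# assuming closed intervals [start, end]
countLoci <- function(data, segs) {
  vapply(seq_len(nrow(segs)), function(i) {
    sum(data$chromosome == segs$chromosome[i] &
        data$x >= segs$start[i] &
        data$x <= segs$end[i])
  }, integer(1))
}

segs$nbrOfLoci <- countLoci(data, segs)

# Known segments without any loci are the ones that may trigger the error
emptySegs <- segs[segs$nbrOfLoci == 0L, ]
print(emptySegs)
```

If emptySegs has any rows, those known segments contain no loci and are candidates for triggering the validation error in PSCBS 0.45.0.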

CC: @tskir, @etal

HenrikBengtsson commented 9 years ago

Fixing Issue #29 seems to have solved this issue as well, e.g.

> library(PSCBS)
> data <- readRDS("PSCBS_issue28.rds")
> segs <- data.frame(chromosome=1L, start=c(-Inf, 121049953), end=c(121049952, 142517941))
> fit <- segmentByCBS(data, knownSegments=segs)
> fit
  sampleName chromosome     start       end nbrOfLoci    mean
1       <NA>          1    464423   3004886        34 -0.5345
2       <NA>          1   3004886 119434332       755 -0.2217
3       <NA>          1 119434332 121049952        38  0.3596
4       <NA>          1 121049953 142517941         0      NA