rnabioco / valr

Genome Interval Arithmetic in R
http://rnabioco.github.io/valr/

chrom in x missing in y is dropped from bed_coverage output #395

Closed kcamnairb closed 1 year ago

kcamnairb commented 1 year ago

Hi,

I think I found some unexpected behavior in bed_coverage. If an interval in x is on a chromosome that doesn't exist in y, I would still expect it to appear in the output with zero coverage. Instead, it is dropped. Below, the interval "chr3", 800, 900, '-' is missing from the output, while "chr2", 800, 900, '-' shows zero coverage. I'm running version 0.6.6.

library(valr)
x <- tibble::tribble(
  ~chrom, ~start, ~end, ~strand,
  "chr1", 100,    500,  '+',
  "chr2", 200,    400,  '+',
  "chr2", 300,    500,  '-',
  "chr2", 800,    900,  '-',
  "chr3", 800,    900,  '-'
)

y <- tibble::tribble(
  ~chrom, ~start, ~end, ~value, ~strand,
  "chr1", 150,    400,  100,    '+',
  "chr1", 500,    550,  100,    '+',
  "chr2", 230,    430,  200,    '-',
  "chr2", 350,    430,  300,    '-'
)
bed_coverage(x, y)

jayhesselberth commented 1 year ago

Thanks. Just adding the reprex; will take a look soon.

library(valr)
x <- tibble::tribble(
  ~chrom, ~start, ~end, ~strand,
  "chr1", 100,    500,  '+',
  "chr2", 200,    400,  '+',
  "chr2", 300,    500,  '-',
  "chr2", 800,    900,  '-',
  "chr3", 800,    900,  '-'
)

y <- tibble::tribble(
  ~chrom, ~start, ~end, ~value, ~strand,
  "chr1", 150,    400,  100,    '+',
  "chr1", 500,    550,  100,    '+',
  "chr2", 230,    430,  200,    '-',
  "chr2", 350,    430,  300,    '-'
)
bed_coverage(x, y)
#> # A tibble: 4 × 8
#>   chrom start   end strand .ints  .cov  .len .frac
#>   <chr> <dbl> <dbl> <chr>  <int> <int> <int> <dbl>
#> 1 chr1    100   500 +          2   250   400 0.625
#> 2 chr2    200   400 +          2   170   200 0.85 
#> 3 chr2    300   500 -          2   130   200 0.65 
#> 4 chr2    800   900 -          0     0   100 0

Created on 2023-02-03 with reprex v2.0.2

kriemo commented 1 year ago

Thanks for reporting this bug. This issue should now be fixed on the main branch. You can install the updated package using devtools.

devtools::install_github('rnabioco/valr')
library(valr)
x <- tibble::tribble(
  ~chrom, ~start, ~end, ~strand,
  "chr1", 100,    500,  '+',
  "chr2", 200,    400,  '+',
  "chr2", 300,    500,  '-',
  "chr2", 800,    900,  '-',
  "chr3", 800,    900,  '-'
)

y <- tibble::tribble(
  ~chrom, ~start, ~end, ~value, ~strand,
  "chr1", 150,    400,  100,    '+',
  "chr1", 500,    550,  100,    '+',
  "chr2", 230,    430,  200,    '-',
  "chr2", 350,    430,  300,    '-'
)
bed_coverage(x, y)
#> # A tibble: 5 × 8
#>   chrom start   end strand .ints  .cov  .len .frac
#>   <chr> <dbl> <dbl> <chr>  <int> <int> <dbl> <dbl>
#> 1 chr1    100   500 +          2   250   400 0.625
#> 2 chr2    200   400 +          2   170   200 0.85 
#> 3 chr2    300   500 -          2   130   200 0.65 
#> 4 chr2    800   900 -          0     0   100 0    
#> 5 chr3    800   900 -          0     0   100 0

Created on 2023-02-04 with reprex v2.0.2
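
For anyone who has to stay on 0.6.6 for now, one possible workaround is to add the dropped intervals back by hand. The sketch below is only an illustration, not part of valr: pad_missing_coverage is a hypothetical helper name, it assumes x carries chrom, start, end and strand columns, and it assumes the missing rows should report zero coverage with .len computed as end - start. To keep the example independent of the installed valr version, cov is typed in from the four rows reported for 0.6.6 earlier in this thread; in practice you would pass bed_coverage(x, y) instead.

```r
library(dplyr)

# Hypothetical helper (not part of valr): re-add x intervals that are
# absent from a coverage result, reporting them with zero coverage.
pad_missing_coverage <- function(x, cov) {
  keys <- c("chrom", "start", "end", "strand")
  # Rows of x that have no matching row in the coverage output
  dropped <- anti_join(x, cov, by = keys)
  # Assumption: a dropped interval has no overlaps, so every count is zero
  zero <- mutate(dropped, .ints = 0L, .cov = 0L, .len = end - start, .frac = 0)
  arrange(bind_rows(cov, zero), chrom, start, end)
}

x <- tibble::tribble(
  ~chrom, ~start, ~end, ~strand,
  "chr1", 100,    500,  "+",
  "chr2", 200,    400,  "+",
  "chr2", 300,    500,  "-",
  "chr2", 800,    900,  "-",
  "chr3", 800,    900,  "-"
)

# The 0.6.6 output from the reprex above, entered by hand
cov <- tibble::tribble(
  ~chrom, ~start, ~end, ~strand, ~.ints, ~.cov, ~.len, ~.frac,
  "chr1", 100,    500,  "+",     2L,     250L,  400L,  0.625,
  "chr2", 200,    400,  "+",     2L,     170L,  200L,  0.85,
  "chr2", 300,    500,  "-",     2L,     130L,  200L,  0.65,
  "chr2", 800,    900,  "-",     0L,     0L,    100L,  0
)

padded <- pad_missing_coverage(x, cov)
padded
```

On a version that already includes the fix, anti_join should find nothing to add, so the helper would return the coverage table with its rows sorted and nothing appended.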