Thanks, just adding the reprex
library(tidyverse)
library(valr)
tibble::tribble(
~chrom, ~start, ~end,
"chr1", 5, 20,
"chr1", 30, 40
) %>%
bed_cluster(max_dist = 10)
#> # A tibble: 2 × 4
#> chrom start end .id
#> <chr> <dbl> <dbl> <int>
#> 1 chr1 5 20 1
#> 2 chr1 30 40 1
tibble::tribble(
~chrom, ~start, ~end,
"chr1", 5, 20,
"chr1", 30, 40,
"chr1", 1, 10
) %>%
bed_cluster(max_dist = 10)
#> # A tibble: 3 × 4
#> chrom start end .id
#> <chr> <dbl> <dbl> <int>
#> 1 chr1 1 10 1
#> 2 chr1 5 20 1
#> 3 chr1 30 40 2
Created on 2023-04-05 with reprex v2.0.2
Thanks for reporting this bug. This should now be fixed in the main branch, which you can install via devtools.
# install.packages("devtools")
devtools::install_github('rnabioco/valr')
It works great! Thank you.
Sorry, I'm still having the same issue with different data. All the intervals below should cluster together.
library(tidyverse)
library(valr)
tibble::tribble(
~chrom, ~start, ~end,
"scaffold_66", 27262, 70396,
"scaffold_66", 66594, 67647,
"scaffold_66", 82218, 85280,
"scaffold_66", 85878, 87553,
"scaffold_66", 87831, 89885,
"scaffold_66", 90498, 91996
) %>%
bed_cluster(max_dist = 20000)
#> # A tibble: 6 × 4
#> chrom start end .id
#> <chr> <dbl> <dbl> <int>
#> 1 scaffold_66 27262 70396 1
#> 2 scaffold_66 66594 67647 1
#> 3 scaffold_66 82218 85280 1
#> 4 scaffold_66 85878 87553 1
#> 5 scaffold_66 87831 89885 1
#> 6 scaffold_66 90498 91996 2
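As a cross-check on the expected result, here is a minimal base R sketch of distance-based clustering. It is not valr's implementation: `cluster_reference` is a hypothetical helper, and it assumes two intervals share a cluster when the gap between them is at most `max_dist`, which is what the two-interval example earlier in the thread suggests.

```r
# Reference clustering in base R (hypothetical helper, not valr's code).
# Assumption: intervals join the current cluster when
# start - (furthest end seen so far) <= max_dist.
cluster_reference <- function(chrom, start, end, max_dist = 0) {
  ord <- order(chrom, start, end)   # visit intervals in sorted order
  id <- integer(length(start))      # cluster ids, returned in input order
  cur_id <- 0L
  cur_chrom <- NA_character_
  cur_end <- -Inf
  for (i in ord) {
    # Open a new cluster on a new chromosome or when the gap is too large.
    if (!identical(chrom[i], cur_chrom) || start[i] - cur_end > max_dist) {
      cur_id <- cur_id + 1L
      cur_chrom <- chrom[i]
      cur_end <- end[i]
    } else {
      # Keep the furthest end so nested intervals do not shrink the cluster.
      cur_end <- max(cur_end, end[i])
    }
    id[i] <- cur_id
  }
  id
}

# Three-interval example from the original report (max_dist = 10).
ids_small <- cluster_reference(
  chrom = rep("chr1", 3),
  start = c(5, 30, 1),
  end = c(20, 40, 10),
  max_dist = 10
)

# Six-interval scaffold_66 example (max_dist = 20000).
ids_scaffold <- cluster_reference(
  chrom = rep("scaffold_66", 6),
  start = c(27262, 66594, 82218, 85878, 87831, 90498),
  end = c(70396, 67647, 85280, 87553, 89885, 91996),
  max_dist = 20000
)

print(ids_small)     # every id is 1
print(ids_scaffold)  # every id is 1
```

Under that assumption every interval in both examples lands in cluster 1, since the largest gap in the scaffold_66 data is 82218 - 70396 = 11822, well under 20000. Tracking the furthest end seen so far, rather than the previous interval's end, matters when one interval is nested inside another, as 66594-67647 is here.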
Thanks for reopening with the additional example. bed_cluster needs additional tests to avoid these bugs. Hopefully I can have a fix for you in the next few days.
This should be fixed now. Thanks again for reporting, and please reopen if you find this issue unresolved on additional datasets.
Hi, I found some strange output with bed_cluster: when an interval that is further away is included, other intervals are no longer clustered together. You can see in the example that with max_dist set to 10, intervals 5-20 and 30-40 cluster together, but when interval 1-10 is included, intervals 5-20 and 30-40 no longer cluster together.