dgrtwo / fuzzyjoin

Join tables together on inexact matching

distance_inner_join: Error in rowSums((v1 - v2)^2) : 'x' must be an array of at least two dimensions #43

Closed scottgigante closed 5 years ago

scottgigante commented 6 years ago

I have two tibbles, each with a column called pos, that I am trying to join with distance_inner_join.

> df1
# A tibble: 1,398 x 3
   gene_name     strand       pos
   <chr>         <fct>      <dbl>
 1 1500015O10Rik +       43730601
 2 1700001G17Rik +       33669823
 3 1700003I22Rik +       56018977
 4 1700007P06Rik +      187125137
 5 1700016C15Rik +      177729813
 6 1700016L21Rik +       80445931
 7 1700019A02Rik -       53158746
 8 1700019D03Rik -       52925246
 9 1700019P21Rik -      138876645
10 1700020N18Rik -       91405182
# ... with 1,388 more rows
> df2
# A tibble: 58 x 3
   type      areaStat       pos
   <chr>        <dbl>     <dbl>
 1 imprinted    -28.5  37977873
 2 imprinted    -23.4  55027489
 3 imprinted    -21.5 123876627
 4 imprinted    -19.2  95665495
 5 imprinted     18.3 172148438
 6 imprinted     18.4  97938015
 7 imprinted     18.5  31347008
 8 imprinted     20.7  96774378
 9 imprinted     21.3 154282393
10 imprinted     23.0 118973902
# ... with 48 more rows

When I run distance_inner_join, I get the following error:

> distance_inner_join(
+   df1, 
+   df2, 
+   by="pos", max_dist=1e6)
Error in rowSums((v1 - v2)^2) : 
  'x' must be an array of at least two dimensions

and the traceback:

6. stop("'x' must be an array of at least two dimensions")
5. rowSums((v1 - v2)^2)
4. multi_match_fun(ux_input, uy_input)
3. fuzzy_join(x, y, multi_by = by, multi_match_fun = match_fun, mode = mode)
2. distance_join(x, y, by, max_dist = max_dist, method = method, mode = "inner", distance_col = distance_col)
1. distance_inner_join(gene_tss %>% filter(chr == .x) %>% select(-chr), dmr_midpoint %>% filter(chr == .x) %>% select(-chr), by = "pos", max_dist = 1e+06)

I installed the latest version from GitHub today (2018-07-23).

akarlinsky commented 5 years ago

I have the same issue. Have you found a solution?

dgrtwo commented 5 years ago

This is now fixed in the GitHub version (likely up on CRAN this week). Thanks for the report!

BTW, this happened because the join was on only one column (one dimension), so rowSums received a plain vector rather than a two-dimensional array. A workaround would have been to use difference_inner_join instead with the same arguments.
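
For readers hitting the same error on an older version, here is a minimal sketch of both points. It is an illustration and not the package's actual code. The first part uses base R only and reproduces the rowSums failure on plain vectors. The second part tries the suggested difference_inner_join workaround on two toy data frames. The toy values (including the made-up position 33000000) and the names toy1 and toy2 are assumptions for the example, not the data from the report.

```r
# Part 1 (base R only): why a single column triggers the error.
v1 <- c(43730601, 33669823)
v2 <- c(37977873, 55027489)

# rowSums() needs at least two dimensions, so a bare vector fails.
res <- tryCatch(
  rowSums((v1 - v2)^2),
  error = function(e) conditionMessage(e)
)
print(res)

# As one-column matrices the same expression works, and the Euclidean
# distance in one dimension is just the absolute difference.
m1 <- matrix(v1, ncol = 1)
m2 <- matrix(v2, ncol = 1)
d <- sqrt(rowSums((m1 - m2)^2))
print(isTRUE(all.equal(d, abs(v1 - v2))))

# Part 2: the suggested workaround, run only if fuzzyjoin is installed.
# toy1 and toy2 are hypothetical stand-ins for df1 and df2.
if (requireNamespace("fuzzyjoin", quietly = TRUE)) {
  toy1 <- data.frame(gene_name = c("1500015O10Rik", "1700001G17Rik"),
                     pos = c(43730601, 33669823))
  toy2 <- data.frame(type = c("imprinted", "imprinted"),
                     pos = c(37977873, 33000000))
  joined <- fuzzyjoin::difference_inner_join(toy1, toy2,
                                             by = "pos", max_dist = 1e6)
  print(joined)
}
```

Since the distance on a single numeric column reduces to an absolute difference, difference_inner_join should select the same pairs as distance_inner_join would for a one-column join.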

scottgigante commented 5 years ago

Thanks for the fix and the explanation!