signaturescience / skater

SKATE R Utilities
https://signaturescience.github.io/skater/

New function to order IDs, and put into all read_ functions #39

Closed stephenturner closed 3 years ago

stephenturner commented 3 years ago

We can't be sure that .fam files, akt results, or output from other tools (IBIS, KING, PLINK, etc.) will always list each pair of IDs in the same order, e.g.

id1 id2 results
a b ???
a c ???
b c ???

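That is, one tool might report the same pair as id1=b, id2=a. Joining with by=c("id1", "id2") would then fail to match those rows. The function below reorders each pair so that id1 sorts before id2, but it hardcodes the column names and should be improved with tidy evaluation.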

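# Requires dplyr to be attached for %>%, mutate(), and select().
# Reorders each pair so id1 holds the smaller ID and id2 the larger.
order_ids <- function(.data) {
  .data %>% 
    mutate(.x1=pmin(id1, id2), .x2=pmax(id1, id2)) %>% 
    mutate(id1=.x1, id2=.x2) %>% 
    select(-.x1, -.x2)
}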
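
To make the failure mode concrete, here is a minimal sketch of the join with and without reordering. The tables `x` and `y` and their values are made up for illustration, and the helper repeats the hardcoded `order_ids()` above so the block runs on its own, assuming dplyr and tibble are installed.

```r
library(dplyr)

# Made-up toy data: the same three pairs, reported in different orders.
x <- tibble::tibble(id1 = c("a", "a", "c"), id2 = c("b", "c", "b"), r1 = 1:3)
y <- tibble::tibble(id1 = c("b", "c", "b"), id2 = c("a", "a", "c"), r2 = 4:6)

# Joining as-is: no (id1, id2) combination is spelled the same way in both tables.
naive <- inner_join(x, y, by = c("id1", "id2"))

# Same helper as above: smaller ID goes to id1, larger to id2.
order_ids <- function(.data) {
  .data %>%
    mutate(.x1 = pmin(id1, id2), .x2 = pmax(id1, id2)) %>%
    mutate(id1 = .x1, id2 = .x2) %>%
    select(-.x1, -.x2)
}

# After reordering both tables, every pair lines up.
fixed <- inner_join(order_ids(x), order_ids(y), by = c("id1", "id2"))

nrow(naive)
nrow(fixed)
```

Comparing `nrow(naive)` with `nrow(fixed)` shows how many pairs the unordered join drops without any warning.
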
stephenturner commented 3 years ago

This essentially does it, taking the ID columns as arguments via tidy evaluation:

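# Function: reorder a pair of ID columns so the first holds the smaller ID.
# Assumes %>% is available (e.g. via library(dplyr) or library(magrittr)).
order_ids <- function(.data, .id1, .id2) {
  # Capture the column names passed by the caller
  .id1 <- rlang::enquo(.id1)
  .id2 <- rlang::enquo(.id2)
  .data %>% 
    # Temporary columns avoid overwriting one ID before the other is computed
    dplyr::mutate(.x1=pmin(!!.id1, !!.id2), .x2=pmax(!!.id1, !!.id2)) %>% 
    dplyr::mutate(!!.id1:=.x1, !!.id2:=.x2) %>% 
    dplyr::select(-.x1, -.x2)
}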

# examples
d1 <- tibble::tribble(
  ~id1, ~id2, ~results1,
   "a",  "b",       10L,
   "a",  "c",       20L,
   "c",  "b",       30L
)
d2 <- tibble::tribble(
  ~id1, ~id2,  ~results2,
   "b",  "a",       101L,
   "c",  "a",       201L,
   "b",  "c",       301L
)

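# Each table on its own, with IDs reordered
d1 %>% order_ids(id1, id2)
d2 %>% order_ids(id1, id2)
# Reorder both tables, then join on the now-consistent ID columns
list(d1, d2) %>% purrr::map(order_ids, id1, id2) %>% purrr::reduce(dplyr::inner_join, by=c("id1", "id2"))
dplyr::inner_join(order_ids(d1, id1, id2), order_ids(d2, id1, id2), by=c("id1", "id2"))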
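
As a possible alternative spelling, the same idea can be sketched with the embrace operator `{{ }}` instead of `enquo()`/`!!`. This is only a sketch: the function name `order_ids_embrace`, the column names `ind1`/`ind2`, and the kinship values are hypothetical, and it assumes a dplyr/rlang recent enough to support `"{{ }}" :=` names (rlang 0.4.3 or later, as far as I know).

```r
library(dplyr)

# Hypothetical variant of order_ids() using embracing instead of enquo()/!!.
order_ids_embrace <- function(.data, .id1, .id2) {
  .data %>%
    # Compute both ordered values before overwriting either ID column
    mutate(.x1 = pmin({{ .id1 }}, {{ .id2 }}),
           .x2 = pmax({{ .id1 }}, {{ .id2 }})) %>%
    # "{{ }}" on the left-hand side reuses the caller's column names
    mutate("{{ .id1 }}" := .x1, "{{ .id2 }}" := .x2) %>%
    select(-.x1, -.x2)
}

# Made-up input whose ID columns are not called id1/id2.
k <- tibble::tibble(
  ind1    = c("s2", "s1"),
  ind2    = c("s1", "s3"),
  kinship = c(0.25, 0.125)
)

out <- order_ids_embrace(k, ind1, ind2)
out

# Reordering already-ordered IDs should leave the table unchanged.
again <- order_ids_embrace(out, ind1, ind2)
isTRUE(all.equal(out, again))
```

Running the function a second time on its own output is a cheap check that the ordering is stable, which matters if it ends up being called inside every read_ function.
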