yoonsucho / CAMERA

Mendelian Randomization approach using Cross Ancestral Model
Other
6 stars 5 forks source link

Mutate: object 'position' not found #9

Closed yoonsucho closed 3 years ago

yoonsucho commented 3 years ago

The issue with the mutate function occurs when the following phenotypes are used:

  > c <- MultiAncestrySummarySet$new(
  >   exposure_ids=c("ukb-b-5779", "ukb-e-1558_EAS"), 
  >   outcome_ids=c("ieu-a-2", "bbj-a-1"),  ##
  >   bfiles=c("/work/yc16575/1kg_imp_gwasglue/EUR", "/work/yc16575/1kg_imp_gwasglue/EAS"), 
  >   plink = genetics.binaRies::get_plink_binary(), 
  >   radius=50000, 
  >   clump_pop="EAS"
  > )
  > c$extract_instruments()
  > c$extract_instrument_regions()  

Error message:

Error: arrange() failed at implicit mutate() step.

  • Problem with mutate() input #..1. ✖ object 'position' not found ℹ Input ..1 is position. Run rlang::last_error() to see where the error occurred.

The error seems to occur from the line 87-90 (MultiAncestrySummarySet.R):

  >     instrument_raw <- ieugwasr::variants_rsid(unique(instrument_raw$SNP)) %>%
  >       dplyr::select(SNP=query, chr, **position=pos**) %>%
  >       dplyr::inner_join(., instrument_raw, by="SNP") %>%
  >       dplyr::arrange(id.exposure, chr, position)
yoonsucho commented 3 years ago

The function works fine most of the time but randomly fails

yoonsucho commented 3 years ago

Error in api_query("associations", query = list(variant = variants, id = id, : The query to MR-Base exceeded 300 seconds and timed out. Please simplify the query

-> generate a chunk?

yoonsucho commented 3 years ago

@explodecomputer Could you please confirm this if possible? I've been trying to apply Try / tryCatch() but not sure where the function should be located. Also, between try and tryCatch(), I'm not sure if which function should be used. Below is the script that I have revised. I have marked the points where I was confused with # mark:

extract_instrument_regions = function(radius=self$radius, instrument_raw=self$instrument_raw, exposure_ids=self$exposure_ids) {

return a list of lists e.g.

# region1:
# pop1:
# data frame
# pop2:
# data frame

# create list of regions in chr:pos format
regions <- unique(paste0(instrument_raw$chr, ":", instrument_raw$position-radius, "-", instrument_raw$position+radius))

# Lookup each region in each exposure
self$instrument_regions <- lapply(regions, function(r) { 

##########Should I put tryCatch(){} here rather than insert tryCatch into each lappy loop? message(r) a <- lapply(self$exposure_ids, function(id) { tryCatch({ ##########Would try() be better to ignore the error caused by the server? message(id) ieugwasr::associations(r, id) %>% dplyr::arrange(position) %>% dplyr::filter(!duplicated(rsid)) }, error = function(e) { message(paste0("Error occured in ", r, ": Skipping ", r)) ##########I would like to show the message when the the error occurs to indicate some regions (presented as "r" here) were excluded due to the error. return(NULL)}) })

  # subset to keep only the same SNPs across datasets
  # Make sure they have the same effect and other alleles
  rsids <- Reduce(intersect, lapply(a, function(x) try(x$rsid)))
  a <- lapply(a, function(x) {
    tryCatch({
    subset(x, rsid %in% rsids)
            }, error = function(e) return(NULL))
  })

  # check effect and other alleles are the same
  ea <- a[[1]]$ea
  a <- lapply(a, function(x) {
    tryCatch({
    index <- x$ea != ea
    if(sum(index) > 0)
    {
      x$beta[index] <- x$beta[index] * -1
      nea <- x$nea[index]
      x$nea[index] <- x$ea[index]
      x$ea[index] <- x$nea[index]
      x$eaf[index] <- 1-x$eaf[index]
      x <- subset(x, nea == a[[1]]$nea)
    }
    return(x)
              }, error = function(e) return(NULL))
  })
  rsids <- Reduce(intersect, lapply(a, function(x) try(x$rsid)))
  a <- lapply(a, function(x) {
    tryCatch({
    subset(x, rsid %in% rsids) %>%
      dplyr::arrange(chr, position)
              }, error = function(e) return(NULL))
  })
  names(a) <- self$exposure_ids
  return(a)
})
names(self$instrument_regions) <- unique(instrument_raw$rsid)

},

yoonsucho commented 3 years ago

Kind of resolved, still not sure if it is okay to omit SNPs in the excluded region (due to server error).