ness-lab / mrt

0 stars 2 forks source link

Try using E(r2) eqn from Hill and Weir 1988 to fit for genome-wide recombination rate #31

Open aays opened 4 years ago

aays commented 4 years ago

Another idea from lab meeting. Could be a lot faster than full LDhelmet runs when all we really care about is the genome-wide recombination rate, although if we adopt this it'll probably be a good idea to do a few test runs with both to assess concordance.

Paper in question - see bottom of Appendix 2

aays commented 4 years ago

Also - for reference - here's the snippet of code I used when doing this to look at LD decay in C. reinhardtii:

library(readr)
library(dplyr)
library(stringr)
library(magrittr)
library(fs)
library(purrr)

fnames <- dir_ls(regexp = 'chromosome_[0-9]{1,2}\\.csv')

weir_hill <- function(fname) {

    chrname <- str_extract(fname, 'chromosome_[0-9]{1,2}')
    outname <- paste0(chrname, '_fit.csv')
    equation <- '((10 + p*d)/(22 + (13*p*d) + (p*d)^2))*(1 + (((3 + (p*d))/(24*(22 + (13*p*d) +\
                   (p*d)^2))) * (12 + (12*p*d) + (p*d)^2)))' %>%
                   str_replace(., '\\n', '') %>%
                   str_replace(' {2,}', '')

    d <- read_csv(fname, col_types = cols()) %>%
        mutate(d = abs(pos2 - pos1)) %>%
        select(d, r2)
    predicted_decay <- nls(
            paste('r2', '~', equation),
            data = d, control = list(maxiter = 500), start = list(p = 0.5)
        ) %>%
        predict() %>%
        as_tibble()
    colnames(predicted_decay) <- 'r2'
    predicted_decay$d <- d$d

    predicted_decay %<>%
        group_by(d) %>%
        summarise(r2 = mean(r2)) %>%
        arrange(d) %>%
        mutate(chrom = chrname) %>%
        select(chrom, d, r2)

    write_csv(predicted_decay, outname)
}

fnames %>%
    walk(~ weir_hill(.))

From here