rwdavies / STITCH

STITCH - Sequencing To Imputation Through Constructing Haplotypes
http://www.nature.com/ng/journal/v48/n8/abs/ng.3594.html
GNU General Public License v3.0
73 stars 19 forks source link

possible error in validating the genetic map #94

Open Ahhgust opened 4 months ago

Ahhgust commented 4 months ago

Hi! I'm trying to re-use one of our standard genetic maps w/ QUILT (which in turns uses STITCH), and some of the genetic map files are generating the following error. The example (I think) stands on its own, and the extra line of R code to show that the error message (by my eye) doesn't quite track.

I think it has something to do with the genetic map not starting at 0 (presumably the 0 was lost in the liftover), but I cannot say for certain.

foo <- get_and_validate_genetic_map("ForQuilt/chr22.b38.gmap.gz") position COMBINED_rate.cM.Mb. Genetic_Map.cM. 1 16370978 0.5018727 1.585395 2 16372046 0.4959920 1.585931 Error in validate_genetic_map(genetic_map) : Error interpreting genetic map around row 1. Provided genetic map does not satisfy expectations. See above print. Let g be the genetic map with 3 columns. In 1-based coordinates, recall that g[iRow, 3] must equal g[iRow - 1, 3] + ( (g[iRow, 1] - g[iRow - 1, 1]) g[iRow - 1, 2]) (1.585395 + ( (16372046-16370978)/1e6 0.5018727) ) [1] 1.585931

The most likely answer is that I managed to muck something up, but for the life of me I cannot see what the error is. Can you take a look? Thanks! -August

rwdavies commented 4 months ago

I think it is just that the way I wrote the code requires the first value to be 0

For example this code gives errors / works as commented below

source("https://raw.githubusercontent.com/rwdavies/STITCH/master/STITCH/R/genetic-map.R")
genetic_map <- rbind(
    c(16370978, 0.5018727, 1.585395),
    c(16372046, 0.4959920, 1.585931)
)
colnames(genetic_map) <- c("position", "COMBINED_rate.cM.Mb.", "Genetic_Map.cM.")
validate_genetic_map(genetic_map) ## throws error
genetic_map[, 3] <- genetic_map[, 3] - genetic_map[1, 3]
validate_genetic_map(genetic_map) ## is fine

Hopefully that's an easy fix for your text files, just read in and remove the genetic map value at the first position.

Thanks! Robbie